
Design for Operations

Designing Manageable
Applications
August 2008
Table of Contents
Introduction .................................................................................................................................... 12
Intended Audiences ................................................................................................................ 12
How This Guide Is Organized ................................................................................................ 12
Chapter Outline ...................................................................................................................... 13
Scenarios Discussed in This Guide ....................................................................................... 14
Worked Example Used in This Guide .................................................................................... 14
Northern Electronics Shipping Application ......................................................................... 14
The Dynamic Systems Initiative (DSI) ...................................................................................... 15
Patterns and Practices ........................................................................................................... 15
Feedback and Support ........................................................................................................... 15
Acknowledgments .................................................................................................................. 15
Section 1 ........................................................................................................................................ 17
Introduction to Manageable Applications ................................................................................... 17
Chapter 1 ....................................................................................................................................... 18
Understanding Manageable Applications .................................................................................. 18
Application Perspectives ........................................................................................................ 19
Operating Business Applications............................................................................................ 19
Application Dependencies ...................................................................................................... 20
Core Principles for Designing Manageable Applications ....................................................... 22
Northern Electronics Scenario................................................................................................ 23
Operations Challenges ....................................................................................................... 24
Development Challenges.................................................................................................... 25
Summary ................................................................................................................................ 25
Chapter 2 ....................................................................................................................................... 26
A High-Level Process for Manageable Applications .................................................................. 26
Roles Participating in the High-Level Process ....................................................................... 27
Understanding the Process .................................................................................................... 29
Designing the Manageable Application .............................................................................. 29
Developing the Manageable Application ............................................................................ 30
Deploying the Manageable Application .............................................................................. 30
Operating the Manageable Application .............................................................................. 30
Facilitating the Process Guidance and Artifacts.................................................................. 31
Northern Electronics Scenario................................................................................................ 33
Summary ................................................................................................................................ 34
Section 2 ........................................................................................................................................ 36
Architecting for Operations ........................................................................................................ 36
Chapter 3 ....................................................................................................................................... 37
Architecting Manageable Applications ....................................................................................... 37
Designing Manageable Applications ...................................................................................... 37
Representing Applications as Managed Entities .................................................................... 39
Advantages of Using Managed Entities ................................................................................. 40
Providing an Operations View of an Application................................................................. 41
Ensuring That Instrumentation Is Sufficient ........................................................................ 41
Close Mapping to Configuration ......................................................................................... 41
Benefits of Defining a Management Model for the Application .............................................. 42
Designing, Developing, Deploying, and Maintaining Manageable Applications: Refining the Process ..................................................................................................................................... 42
Northern Electronics Scenario................................................................................................ 43
Summary ................................................................................................................................ 44
Chapter 4 ....................................................................................................................................... 45
Creating Effective Management Models .................................................................................... 45
Benefits of Using Management Models ................................................................................. 46
Management Model Views ..................................................................................................... 46
Comprehensive Management Models ................................................................................... 47
Configuration Modeling ....................................................................................................... 47
Task Modeling .................................................................................................................... 48
Instrumentation Modeling ................................................................................................... 49
Health Modeling .................................................................................................................. 49
Performance Modeling ........................................................................................................ 50
Modeling Instrumentation and Health .................................................................................... 51
Effective Instrumentation Modeling .................................................................................... 51
Types of Instrumentation................................................................................................. 51
Performance Counters .................................................................................................... 52
Events ............................................................................................................................. 52
Determining What to Instrument ......................................................................................... 53
Granularity of Instrumentation ......................................................................................... 54
Performance Considerations........................................................................................... 54
Building Effective Health Models ............................................................................................ 54
Health States ...................................................................................................................... 55
Health State Hierarchies ..................................................................................................... 56
Managed Entity Hierarchies ............................................................................................ 56
Aggregate Aspects .......................................................................................................... 58
Rolling Up Aspects into Managed Entities ...................................................................... 59
Monitoring and Troubleshooting Workflow ......................................................................... 60
Detection ......................................................................................................................... 60
Verification....................................................................................................................... 61
Diagnostics ...................................................................................................................... 61
Resolution ....................................................................................................................... 62
Re-verification ................................................................................................................. 62
Structure of a Health Model ................................................................................................ 62
Mapping Requirements to Individual Indicators.................................................................. 64
Multiple Distributed Managed Entities ................................................................................ 64
Northern Electronics Scenario................................................................................................ 65
Instrumentation Model ........................................................................................................ 66
Health Model ....................................................................................................................... 66
Summary ................................................................................................................................ 68
Chapter 5 ....................................................................................................................................... 69
Proven Practices for Application Instrumentation ...................................................................... 69
Events and Metrics ................................................................................................................. 69
Architectural Principles for Effective Instrumentation ............................................................. 69
Create a Flexible Instrumentation Architecture .................................................................. 70
Create Instrumentation That Operations Staff Easily Understands.................................... 70
Support Existing Operations Processes and Tools ............................................................ 70
Create Applications That Are Not Self-Monitoring .............................................................. 71
Support Flexible Configuration of Instrumentation ............................................................. 71
Using Instrumentation Levels to Specify Instrumentation Granularity ............................ 71
Using Infrastructure Trust Levels to Specify Instrumentation Technologies ................... 73
Designing Application Instrumentation ................................................................................... 73
Use the Capabilities of the Underlying Platform ................................................................. 73
Provide Separate Instrumentation for Each Purpose ......................................................... 74
Isolate Abstract Instrumentation from Specific Instrumentation Technologies ................... 74
Create an Extensible Instrumentation Architecture ............................................................ 75
Use Base Events for Instrumentation ................................................................................. 75
Use Event Names and Event IDs Consistently .................................................................. 75
Ensure Events Provide Backward Compatibility................................................................. 76
Support Logging to Remote Sources ................................................................................. 76
Consider Distributed Event Correlation .............................................................................. 76
Developing the Instrumentation.............................................................................................. 76
Minimize Resource Consumption ....................................................................................... 77
Consider the Security of the Event Information .................................................................. 77
Supply Appropriate Context Data ....................................................................................... 77
Record the Times Events Are Generated ........................................................................... 77
Provide Resolution Guidance ............................................................................................. 78
Building and Deploying Instrumentation ................................................................................. 78
Automate Implementation of Instrumentation ..................................................................... 78
Automate the Build and Deploy Process ............................................................................ 78
Monitor Applications Remotely ........................................................................................... 78
Summary ................................................................................................................................ 79
Chapter 6 ....................................................................................................................................... 80
Specifying Infrastructure Trust Levels........................................................................................ 80
Infrastructure Model Scenarios .............................................................................................. 81
In-House Application Scenario ........................................................................................... 81
ISV or Shrink-Wrap Application Scenario ........................................................................... 81
Privilege and Trust Considerations ........................................................................................ 82
Tools for Infrastructure Modeling............................................................................................ 84
Standalone Tools ................................................................................................................ 85
Integrated Tools .................................................................................................................. 85
Infrastructure Modeling with the TSMMD ........................................................................... 85
Instrumentation Technologies Supported by the TSMMD .............................................. 86
Northern Electronics Scenario................................................................................................ 87
Summary ................................................................................................................................ 87
Chapter 7 ....................................................................................................................................... 88
Specifying a Management Model Using the TSMMD Tool ........................................................ 88
Requirements for the TSMMD................................................................................................ 88
Creating a Management Model .............................................................................................. 89
The TSMMD Guided Experience ........................................................................................ 89
Creating the TSMMD File ................................................................................................... 89
Graphically Modeling an Operations View of the Application ............................................. 91
Executable Application .................................................................................................... 92
Windows Service ............................................................................................................. 92
ASP.NET Application ...................................................................................................... 93
ASP.NET Web Service ................................................................................................... 94
Windows Communication Foundation (WCF) Service .................................................... 95
Defining Target Environments for the Application .............................................................. 96
Defining Instrumentation for the Application ....................................................................... 97
Defining Abstract Instrumentation ................................................................................... 97
Defining Instrumentation Implementations .................................................................... 100
Discovering Existing Instrumentation in an Application .................................................... 105
Creating Health Definitions ............................................................................................... 109
Validating the Management Model ................................................................................... 111
Management Model Guidelines............................................................................................ 112
Northern Electronics Scenario.............................................................................................. 112
Summary .............................................................................................................................. 115
Section 3 ...................................................................................................................................... 116
Developing for Operations ....................................................................................................... 116
Chapter 8 ..................................................................................................................................... 117
Creating Reusable Instrumentation Helpers ............................................................................ 117
Creating Instrumentation Helper Classes ............................................................................ 117
Instrumentation Solution Folder ........................................................................................... 118
API Projects ...................................................................................................................... 119
Technology Projects ......................................................................................................... 120
Event Log Project .......................................................................................................... 120
Windows Eventing 6.0 Project ...................................................................................... 121
WMI Project ................................................................................................................... 121
Performance Counter Project........................................................................................ 121
Using the Instrumentation Helpers ....................................................................................... 121
Verifying That Instrumentation Code Is Called from the Application ....................................... 122
Summary .............................................................................................................................. 123
Chapter 9 ..................................................................................................................................... 124
Event Log Instrumentation ....................................................................................................... 124
Installing Event Log Functionality ......................................................................................... 125
Event Sources .................................................................................................................. 125
Using the EventLogInstaller Class ....................................................................................... 127
Writing Events to an Event Log ............................................................................................ 129
Using the WriteEntry Method ............................................................................................ 129
The WriteEvent Method .................................................................................................... 130
Reading Events from Event Logs ......................................................................................... 131
Creating and Configuring an Instance of the EventLog Class.......................................... 131
Using the Entries Collection to Read the Entries.............................................................. 132
Clearing Event Logs ............................................................................................................. 133
Deleting Event Logs ............................................................................................................. 133
Removing Event Sources ..................................................................................................... 134
Creating Event Handlers ...................................................................................................... 135
Using Custom Event Logs .................................................................................................... 135
Writing to a Custom Log ................................................................................................... 136
Installing the Custom Log.............................................................................................. 136
Writing Events to the Custom Log ................................................................................ 137
Other Custom Log Tasks .................................................................................................. 137
Summary .............................................................................................................................. 137
Chapter 10 ................................................................................................................................... 138
WMI Instrumentation ................................................................................................................ 138
WMI and the .NET Framework ............................................................................................. 138
Benefits of WMI Support in the .NET Framework............................................................. 139
Limitations of WMI in the .NET Framework ...................................................................... 140
Using WMI.NET Namespaces .......................................................................................... 141
Publishing the Schema for an Instrumented Assembly to WMI ........................................... 142
Republishing the Schema ................................................................................................. 143
Unregistering the Schema ................................................................................................ 143
Instrumenting Applications Using WMI.NET Classes .............................................................. 143
WMI .NET Classes ........................................................................................................... 144
Accessing WMI Data Programmatically ............................................................................... 144
Summary .............................................................................................................................. 145
Chapter 11 ................................................................................................................................... 147
Windows Eventing 6.0 Instrumentation ................................................................................... 147
Windows Eventing 6.0 Overview .......................................................................................... 147
Reusable Custom Views ................................................................................................... 147
Command Line Operations ............................................................................................... 148
Event Subscriptions .......................................................................................................... 149
Integration with Task Scheduler ....................................................................................... 149
Online Event Information .................................................................................................. 150
Publishing Windows Events ................................................................................................. 150
Event Types and Event Channels .................................................................................... 151
Event Types and Channel Groups ................................................................................ 151
Serviced Channel .......................................................................................................... 152
Direct Channel .............................................................................................................. 152
Channels Defined in the Winmeta.xml File ................................................................... 152
Creating the Instrumentation Manifests ............................................................................ 153
Elements in the Instrumentation Manifest ..................................................................... 153
Using Templates for Events .......................................................................................... 157
Using the Message Compiler to Produce Development Files ............................................... 157
Writing Code to Raise Events ........................................................................................... 158
Compiling and Linking Event Publisher Source Code ...................................................... 162
Installing the Publisher Files ............................................................................................. 162
Consuming Event Log Events .............................................................................................. 163
Querying for Events .......................................................................................................... 163
Querying Over Active Event Logs .................................................................................... 163
Querying Over External Files............................................................................................ 163
Reading Events from a Query Result Set ......................................................................... 164
Subscribing to Events ....................................................................................................... 164
Push Subscriptions ........................................................................................................... 165
Pull Subscriptions ............................................................................................................. 168
Summary .............................................................................................................................. 171
Chapter 12 ................................................................................................................................... 172
Performance Counters Instrumentation ................................................................................... 172
Performance Counter Concepts ........................................................................................... 172
Categories......................................................................................................................... 172
Instances........................................................................................................................... 173
Types ................................................................................................................................ 173
Installing Performance Counters .......................................................................................... 174
Writing Values to Performance Counters ............................................................................. 176
Connecting to Existing Performance Counters .................................................................... 178
Performance Counter Value Retrieval ................................................................................. 178
Raw, Calculated, and Sampled Data ................................................................................ 178
Comparing Retrieval Methods .............................................................................................. 180
Summary .............................................................................................................................. 180
Chapter 13 ................................................................................................................................... 181
Building Install Packages ......................................................................................................... 181
Section 4 ...................................................................................................................................... 182
Managing Operations ............................................................................................................... 182
Chapter 14 ................................................................................................................................... 183
Deploying and Operating Manageable Applications ................................................................ 183
Deploying the Application Instrumentation ........................................................................... 183
Running the Instrumented Application ................................................................................. 183
Event Log Instrumentation ................................................................................................ 185
Performance Counter Instrumentation ............................................................................. 187
WMI................................................................................................................................... 188
Trace File Instrumentation ................................................................................................ 188
Summary .............................................................................................................................. 189
Chapter 15 ................................................................................................................................... 190
Monitoring Applications ............................................................................................................ 190
Distributed Monitoring Applications ...................................................................................... 190
Management Packs.............................................................................................................. 192
Rules and Rule Groups ........................................................................................................ 192
Monitoring the Example Application ..................................................................................... 196
Monitoring the Remote Web Service ................................................................................... 201
Summary .............................................................................................................................. 209
Chapter 16 ................................................................................................................................... 210
Creating and Using Microsoft Operations Manager 2005 Management Packs ...................... 210
Importing a Management Model from the MMD into Operations Manager 2005 ................. 210
Viewing the Management Pack ........................................................................................ 212
Guidelines for Importing a Management Model from the Management Model Designer . 218
Creating and Configuring a Management Pack in the Operations Manager 2005
Administrator Console .......................................................................................................... 219
Guidelines for Creating and Configuring a Management Pack in the Operations Manager
2005 Administrator Console ............................................................................................. 232
Editing an Operations Manager 2005 Management Pack ................................................... 233
Editing Rule Groups and Subgroups ................................................................................ 233
Editing Event Rules, Alert Rules, and Performance Rules ............................................... 236
Editing Computer Groups and Rollup Rules..................................................................... 242
Creating and Editing Operators, Notification Groups and Notifications ........................... 246
Viewing and Editing Global Settings ................................................................................. 249
Guidelines for Editing an Operations Manager 2005 Management Pack ........................ 251
Create an Operations Manager 2005 Computer Group and Deploy the Operations Manager
Agent and Rules ................................................................................................................... 251
Guidelines for Creating an Operations Manager 2005 Computer Group and Deploying the
Operations Manager Agent and Rules ............................................................................. 257
View Management Information in Operations Manager 2005 .............................................. 258
Guidelines for Viewing Management Information in Operations Manager 2005 .............. 266
Create Management Reports in Operations Manager 2005 ................................................ 267
Guidelines for Creating Management Reports in Operations Manager 2005 .................. 269
Summary .............................................................................................................................. 269
Chapter 17 ................................................................................................................................... 270
Creating and Using System Center Operations Manager 2007 Management Packs ............. 270
Convert and Import a Microsoft Operations Manager 2005 Management Pack into
Operations Manager 2007.................................................................................................... 270
Guidelines for Converting and Importing a Microsoft Operations Manager 2005
Management Pack into Operations Manager 2007 .......................................................... 271
Creating a Management Pack in the Operations Manager 2007 Operations Console ........ 272
Guidelines for Creating a Management Pack in the Operations Manager 2007 Operations
Console ............................................................................................................................. 294
Editing an Operations Manager 2007 Management Pack ................................................... 295
Guidelines for Editing an Operations Manager 2007 Management Pack ........................ 303
Deploying the Operations Manager 2007 Agent .................................................................. 303
Best Practices for Deploying the Operations Manager 2007 Agent ................................. 306
Viewing Management Information in Operations Manager 2007 ......................................... 306
Guidelines for Viewing Management Information in Operations Manager 2007 .............. 311
Creating Management Reports in Operations Manager 2007 ............................................. 312
Guidelines for Creating Management Reports in Operations Manager 2007 .................. 315
Summary .............................................................................................................................. 316
Section 5 ...................................................................................................................................... 317
Technical References .............................................................................................................. 317
Appendix A .................................................................................................................................. 318
Building and Deploying Applications Modeled with the TSMMD ........................................... 318
Consuming the Instrumentation Helper Classes .................................................................. 318
Verifying Instrumentation Coverage ..................................................................................... 320
Deploying the Application Instrumentation ........................................................................... 322
Installing Event Log Functionality ..................................................................................... 322
Installing Windows Eventing 6.0 Functionality.................................................................. 323
Publishing the Schema for an Instrumented Assembly to WMI ....................................... 323
Installing Performance Counters ...................................................................................... 323
Using a Batch File to Install Instrumentation .................................................................... 324
Using the Event Messages File ........................................................................................ 324
Specifying the Runtime Target Environment and Instrumentation Levels ........................... 324
Generating Management Packs for System Center Operations Manager 2007 ................. 328
Importing a Management Pack into System Center Operations Manager 2007 ................. 330
Prerequisite Management Packs ...................................................................................... 331
Creating a New Distributed Application ................................................................................ 331
Appendix B .................................................................................................................................. 333
Walkthrough of the Team System Management Model Designer Power Tool ........................ 333
Building a Management Model ................................................................................................ 333
Generating the Instrumentation Code ...................................................................................... 347
Testing the Model with a Windows Forms Application ............................................................ 349
Generating an Operations Manager 2007 Management Pack ................................................ 353
Appendix C .................................................................................................................................. 355
Performance Counter Types .................................................................................................... 355
Copyright Information
Information in this document, including URL and other Internet Web site references, is subject
to change without notice. Unless otherwise noted, the companies, organizations, products,
domain names, e-mail addresses, logos, people, places, and events depicted in examples herein
are fictitious. No association with any real company, organization, product, domain name, e-
mail address, logo, person, place, or event is intended or should be inferred. Complying with all
applicable copyright laws is the responsibility of the user. Without limiting the rights under
copyright, no part of this document may be reproduced, stored in or introduced into a retrieval
system, or transmitted in any form or by any means (electronic, mechanical, photocopying,
recording, or otherwise), or for any purpose, without the express written permission of
Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you
any license to these patents, trademarks, copyrights, or other intellectual property.
Microsoft, Windows, System Center Operations Manager, C#, Visual Basic, Visual Studio, and
Team System are trademarks of the Microsoft group of companies.
All other trademarks are property of their respective owners.
© 2008 Microsoft Corporation. All rights reserved.
Introduction
Welcome to Design for Operations: Designing Manageable Applications – August 2008 Release.
This guide describes how to create applications that are easier to manage than existing
applications. When used alongside the associated code artifacts, this guide should help
dramatically simplify the process of creating manageable applications, and therefore reduce the
costs associated with application operations.
Intended Audiences
This guide is designed for people involved in designing, developing, testing, deploying, and
operating business applications. These include people in the following roles:
• Solutions architects
• Infrastructure architects
• Developers
• Senior operators
People in each role are likely to use the guide in different ways; different sections are suitable
for different roles. For more information about which sections are appropriate for particular
roles, see the next section, "How This Guide Is Organized."
How This Guide Is Organized
This guide is designed to provide comprehensive guidance for designing manageable
applications. The guide is divided into five sections. The following table describes each section
and the intended audience for each section.
Section: "Understanding Manageable Applications"
Primary audience: Solutions architects; infrastructure architects
Description: Defines manageable applications. Explains the benefits of manageable applications. Defines a high-level process for designing manageable applications.

Section: "Architecting for Operations"
Primary audience: Solutions architects; infrastructure architects
Description: Examines the architectural principles that should be followed to design manageable applications. Explains management models, and shows how these can be defined.

Section: "Developing for Operations"
Primary audience: Developers
Description: Examines the development tasks that must be performed to create manageable applications. Shows how the management model can be consumed by developers to make developing manageable applications easier.

Section: "Managing Operations"
Primary audience: Operators
Description: Explains how to take manageable applications and use them in an operations environment.

Section: "Technical References"
Primary audience: Developers
Description: Includes technical resources that provide additional information about the process of creating manageable applications.
Chapter Outline
This guide includes the following chapters and appendices:
• "Introduction"
• Section 1, "Understanding Manageable Applications"
◦ Chapter 1, "Understanding Manageable Applications"
◦ Chapter 2, "A High-Level Process for Manageable Applications"
• Section 2, "Architecting for Operations"
◦ Chapter 3, "Architecting Manageable Applications"
◦ Chapter 4, "Creating Effective Management Models"
◦ Chapter 5, "Proven Practices for Application Instrumentation"
◦ Chapter 6, "Specifying Infrastructure Trust Levels"
◦ Chapter 7, "Specifying a Management Model Using the TSMMD Tool"
• Section 3, "Developing for Operations"
◦ Chapter 8, "Creating Reusable Instrumentation Helpers"
◦ Chapter 9, "Event Log Instrumentation"
◦ Chapter 10, "WMI Instrumentation"
◦ Chapter 11, "Windows Eventing 6.0 Instrumentation"
◦ Chapter 12, "Performance Counters Instrumentation"
◦ Chapter 13, "Building Install Packages"
• Section 4, "Managing Operations"
◦ Chapter 14, "Deploying and Operating Manageable Applications"
◦ Chapter 15, "Monitoring Applications"
◦ Chapter 16, "Creating and Using Microsoft Operations Manager 2005
Management Packs"
◦ Chapter 17, "Creating and Using System Center Operations Manager 2007
Management Packs"
• Section 5, "Technical References"
◦ Appendix A, "Building and Deploying Applications Modeled with the TSMMD"
◦ Appendix B, "Walkthrough of the TSMMD Tool"
◦ Appendix C, "Performance Counter Types"
The technical reference appendices are included in this outline for the sake of completeness. However, they are scheduled for inclusion in a later revision of the guide. The plans for the final version of this guide are subject to change, based on feedback from the community.
Scenarios Discussed in This Guide
The principles discussed in this guide apply to a wide range of applications. However, the specific guidance may vary according to the particular type of application being developed. This guide specifically considers a number of different types of applications and offers targeted guidance where appropriate. The types of applications this guide considers include the following:
• Line-of-business (LOB) applications
• Web services
• Smart client applications
• Mobile client applications
Worked Example Used in This Guide
Most chapters in this guide include references to a scenario that helps explain the concepts of
that chapter. The entire guide uses a single worked example, the Northern Electronics shipping
application, instead of using different scenarios for each chapter. This example is refined in each
chapter, according to the requirements of that chapter.
Northern Electronics Shipping Application
Northern Electronics is an electronics maker based in Everett, Washington, with a partly-owned
manufacturing subsidiary based in Nanjing, China.
Product shipping is a core business process for Northern Electronics. However, the company has
had ongoing problems with the product shipping process. Trucks do not always arrive on time.
And even when the trucks do arrive on time, the requirements for each truck are not always
met. Also, the wrong cargo shows up at the loading dock more often than it should. All of these
logistical problems result in much higher overhead costs and, especially, in delays for customers
expecting on-time arrival.
To improve the situation, the Chief Operations Officer (COO) of Northern Electronics has
approved a plan to overhaul the product shipping process. This plan includes the development
of a new product shipping application.
The product shipping application is critical to the continued success of Northern Electronics.
From his previous experience in other companies, the Chief Information Officer (CIO) of
Northern Electronics is aware that business applications can prove less reliable and more costly
to operate than expected and is looking to avoid those problems with this application. He has
asked the solutions architect to carefully consider operations costs when designing this
application.
The solutions architect has committed to work with others in the organization to address
operations costs as he evaluates how to design this application.
The Dynamic Systems Initiative (DSI)
The Dynamic Systems Initiative (DSI) is an effort from Microsoft and its partners to deliver
self-managing dynamic systems. Organizations can use the technologies that form part of DSI to
automate many of the ongoing operations tasks that are currently manually performed. This
results in reduced costs and more time to proactively focus on what is most important to the
organization. Designing manageable applications now represents an important step toward the
goal of providing fully dynamic systems later.
For more details about the DSI initiative, see "Dynamic Systems Initiative" on the Microsoft
Business & Industry Web site at http://www.microsoft.com/business/dsi/default.mspx.
Patterns and Practices
Microsoft patterns & practices are Microsoft recommendations for how to design, develop,
deploy, and operate architecturally sound applications for the Microsoft application platform.
There are four types of patterns & practices guidance:
• Software factories
• Application blocks
• Reference implementations
• Guides
Microsoft patterns & practices contain deep technical guidance and tested source code based
on real-world experience. The technical guidance is created, reviewed, and approved by
Microsoft architects, product teams, consultants, product support engineers, and by Microsoft
partners and customers. The result is a thoroughly engineered and tested set of
recommendations that you can follow with confidence when building your applications.
Feedback and Support
This version of the guide represents preliminary thinking from the Design for Operations (DFO)
team; as such, it is subject to change resulting from feedback. To provide feedback to the DFO
team, please send an e-mail message to netopfbk@microsoft.com.
Acknowledgments
Thanks to the following individuals who assisted with content development, code development,
testing, and documentation:
Core Development Team
• William Loeffler, Microsoft Corporation
• Keith Pleas, Keith Pleas and Associates
• Fernando Simonazzi, Clarius Consulting
• Vanesa Cillo, Clarius Consulting
• Peter Clift, Tek Systems
• Alex Homer, Content Master Ltd
• Paul Slater, Wadeware LLC
Test and Edit Team
• Lakshmi Prabha Vijaya Sundiram, Infosys Technologies Ltd
• Sateesh Venkata Surya Nadupalli, Infosys Technologies Ltd
• Eric Blanchet, VMC Consulting Corporation
• Tina Burden McGrayne, TinaTech Inc
Reviewers
• David Aiken, Microsoft Corporation
• Mary Gray, Microsoft Corporation
• Peter Costatini, Microsoft Corporation
• Marty Hough, Microsoft Corporation
• Kyle Bergum, Microsoft Corporation
• Alex Torone, Microsoft Corporation
• David Trowbridge, Microsoft Corporation
• Tim Sinclair, Microsoft Corporation
• Jeff Levinson, Boeing Corporation
Section 1
Introduction to Manageable
Applications
This section defines manageable applications and explains the benefits to operators, developers,
and architects of manageable applications. It also defines a high-level process for designing,
developing, deploying, and operating manageable applications.
This section should be of use primarily to solutions architects and infrastructure architects.
However, it also provides useful background information to developers and operators.
Chapter 1, "Understanding Manageable Applications"
Chapter 2, "A High-Level Process for Manageable Applications"
Chapter 1
Understanding Manageable
Applications
Hardware and software costs form only a small percentage of the total cost of ownership (TCO)
for enterprise applications. Over time, the costs of managing, maintaining, and supporting those
applications are far more significant.
A large portion of day-to-day running costs is attributable to application failures, performance
degradation, intermittent faults, and operator error. The resultant downtime can severely
impact business processes throughout an organization.
Many of these problems can be mitigated by ensuring that the enterprise applications are
designed to be manageable. As a minimum, a manageable application must meet the following
criteria:
• It is compatible with the target deployment environment.
• It works well with operational tools and processes.
• It provides visibility into the health of the application.
• It is dynamically configurable at run time.

Manageable applications make day-to-day operations a more predictable, efficient process.
However, the benefits of manageable applications are not restricted to the operations team.
With many existing applications, when a problem occurs, the operator attempts to diagnose it
and may solve it by either modifying the configuration of the application or modifying the
system at a lower level (for example, by making changes to the operating system, the hardware,
or the network).
If the operator is unable to diagnose or fix an application problem, the operator may have to
report it to the development team so a fix can be produced. One of the main reasons this
happens is insufficient or irrelevant instrumentation. If architects and developers create
manageable applications, they can reduce the number of times they are called upon to fix
problems through additional development.
This chapter demonstrates how to understand applications from different perspectives and
describes how knowledge of the operations perspective can lead to applications that are
designed to be manageable.
Application Perspectives
Depending on their relationship to an application, different people in an organization will have a
different perspective on it. These perspectives include the following:
• User. The user can be thought of as the consumer of the application. From the user
perspective, an application is responsible for meeting user requirements. Requirements
such as security, performance, and availability are typically defined in a service-level
agreement (SLA).
• Operator. The operator can be thought of as the facilitator of the application. From the
operator perspective, the application must be provided to the user according to the
requirements of the application SLA. The operator is responsible for ensuring that the
requirements of the user are being met and for taking appropriate action if they are not.
Appropriate action includes troubleshooting problems, providing the user with feedback,
and providing the developer with feedback that may lead to further development.
• Developer. The developer can be thought of as the creator of the application. From the
developer perspective, the application must be designed and built to meet the needs
defined by the user. However, when creating manageable applications, the developer
perspective should also capture the needs of the operator and the tasks the operator
must perform.
Each of these perspectives is held by multiple job roles, all of whom should be involved in
developing and consuming a manageable application. For example, the developer perspective
will typically be held by one or more architect roles, along with the application developers. For
more details about the specific job roles involved in a manageable application, see Chapter 3,
"Architecting Manageable Applications."
Operating Business Applications
Before developing manageable applications, it is important to understand the challenges that
operations teams typically face when managing applications.
Operations consists of a series of interrelated tasks, including the following:
• Monitoring applications and services
• Starting and stopping applications and services
• Detecting and resolving failures
• Monitoring performance
• Monitoring security
• Performing change and configuration management
• Protecting data
The operations team is responsible for ensuring day-to-day availability of the application, yet
they are often provided with applications that are difficult to effectively manage. This often
results in a number of problems, including the following:
• An inability to determine the consequences of problems when they occur
• Insufficient run-time configurability of applications
• Poor understanding of interdependencies between the hardware and software
elements that make up a system
• Poorly designed administration tools that do not reflect the way the IT administrator
views the application
• Changes in one part of a system creating significant impact on the overall environment.
The intent of the administrator and the dependencies among the various components
often cannot be determined by looking at how the resources were deployed in the
environment.
• IT administrators providing the only points of integration across different subsystems.
System configuration rules often reside only in someone's head. Typically, there are no
formal records of either the configuration itself or of the changes that have been made
to it.
• Social processes being responsible for achieving coordination of systems.
Administrators have hallway conversations, send e-mail, or write on sticky notes to
remind each other of issues, changes, and so on.
These problems affect the ability of the operations team to manage the application efficiently
and can ultimately affect the experience of the users consuming the application.
To solve these problems, the work of the operations team needs to be considered throughout
application design, development, test, and deployment. In many cases, this will be an iterative
process. For example, the experience gained from the day-to-day operation of the system
should guide improvements to the application design over time. With manageable applications,
it is generally easier to transfer system knowledge between all phases of the IT life cycle.
Application Dependencies
Figure 1 illustrates a typical three-tiered architecture for an application.
Figure 1
Application three-tier architecture
From an operations perspective, applications always execute on a platform and generally
communicate over a network. Applications are dependent on their own underlying system and
network layers, but they may also communicate with, and be dependent on, other applications
and services.
Figure 2 illustrates the application from the perspective of an operations team.
Figure 2
Applications from an operations perspective
Operators collect information that corresponds to each of these layers, using the information to
ensure that applications continue to run smoothly. Understanding each layer as a separate
entity, and understanding the relationships between the layers, often allows the operations
team to quickly isolate the source of any problem.
For example, if a computer running a SQL Server database that provides data to an application
becomes unavailable, the functionality of the application could be affected. In this situation, the
operator needs to know several things:
• What has caused the SQL Server to become unavailable? Typically, this is exposed in the
form of instrumentation at the system tier and network tier. For example, the computer
running SQL Server may have shut down or a network cable may have been removed.
• What are the consequences to the application? Typically, this is exposed in the form of
instrumentation at the application tiers. For example, some functionality of the
application may be lost or performance of the application may be affected.
• What are the consequences to the business operations of the company? Typically, this
can be exposed in the form of instrumentation at the application business logic tier and
may depend on factors outside the application itself. For example, if a business
operation that occurs once a month is affected, and the problem occurs when there are
25 days before the operation occurs again, the problem is less critical than if the
operation must occur every day.
Typically, developers are not concerned with the details of the lower layers. However, an
architect who is designing for operations should have a greater awareness of these details,
because issues at a lower level can lead to problems with the health of the application itself.
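For example, instrumentation can make such lower-level issues visible at the application layer. The following C# sketch shows how a data access component might report a SQL Server availability problem to the Windows event log, telling the operator both the cause and the consequence. The event source name, event ID, and message text here are illustrative assumptions only, not part of the guide's worked example:

```csharp
using System;
using System.Data.SqlClient;
using System.Diagnostics;

public static class ShippingDataAccess
{
    // Hypothetical event source; it must be registered at installation
    // time, because creating a source requires administrative rights.
    private const string EventSource = "NorthernShipping";
    private const int DatabaseUnavailableEventId = 1001;

    public static void OpenShippingDatabase(string connectionString)
    {
        try
        {
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                connection.Open();
                // ... perform data access ...
            }
        }
        catch (SqlException ex)
        {
            // Report the cause (database unavailable) and the consequence
            // (shipment lookups fail), not just the raw exception.
            EventLog.WriteEntry(
                EventSource,
                "The shipping database is unavailable; shipment lookups will " +
                "fail until connectivity is restored. Details: " + ex.Message,
                EventLogEntryType.Error,
                DatabaseUnavailableEventId);
            throw;
        }
    }
}
```

A monitoring tool can then raise an alert from this event and correlate it with system-tier and network-tier information to isolate the source of the problem.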
Core Principles for Designing Manageable Applications
If you are going to design manageable business applications, you must consider manageability
as an integral part of the initial design of the application; it should not be just an afterthought.
Manageability should also be refined and improved through feedback from the operations team
after you get better insight about how applications behave after deployment. The process of
designing manageable applications is the result of collaboration between multiple parties who
must agree to a number of core principles, including the following:
• Applications will provide comprehensive, configurable instrumentation that is
relevant to the IT team. Instrumentation is a very important tool that helps you
understand how an application functions and whether it is functioning as expected.
Instrumentation can also form the basis for determining the resolution to problems.
• Applications will have a health state that varies according to their ability to perform
operations as expected. A healthy application is an application that is performing as
expected. By setting certain parameters for an application, and measuring whether the
application is functioning within those parameters, you can determine the health of an
application and take corrective measures when an application is unhealthy. For more
information about application health, see Chapter 4, "Creating Effective Management
Models."
• Application development must remain independent of the underlying platform.
Problems with the underlying platform can affect the health of an application (for
example, a DNS issue may prevent an application from functioning as expected), and it
is often necessary to capture these dependencies in tools such as System Center
Operations Manager 2007. However, this should not prevent developers from using the
proven practice of developing applications that are independent of the underlying
platform.
• Applications will be managed according to proven practices. Operations teams
currently use a series of practices to manage applications. These practices are
determined by experience and the capabilities and limitations of the available
management tools. Manageable applications should provide an operations experience
similar to the best examples of current manageable server applications.
• Operations will use existing standard management tools to manage applications.
There are many existing tools available for operating applications, including built-in
Microsoft Management Console (MMC) tools such as Event Viewer and Performance
Logs and Alerts. For more sophisticated operations management, there are tools
available such as System Center Operations Manager 2007. Creating new tooling for
managing applications further increases the operations team's workload, so wherever
effective existing tooling is available, it should be used.
This list of core principles is not comprehensive. In many cases, additional core principles will be
established to cover areas such as task management and configuration management.
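The health-state principle above can be illustrated with a small sketch: the application reports a measured value (for example, average response time), and a monitoring rule compares it against operator-set parameters to derive a health state. This is an illustrative sketch only; the threshold values and state names are assumptions for this example and are not part of any Microsoft tooling:

```python
# Illustrative sketch: derive a health state from configured parameters.
# The thresholds and state names are assumptions for this example only.

HEALTHY, DEGRADED, UNHEALTHY = "Healthy", "Degraded", "Unhealthy"

def evaluate_health(avg_response_ms, warn_threshold=500, fail_threshold=2000):
    """Map a measured value onto a health state using operator-set parameters."""
    if avg_response_ms >= fail_threshold:
        return UNHEALTHY   # outside acceptable parameters; corrective action needed
    if avg_response_ms >= warn_threshold:
        return DEGRADED    # still functioning, but trending toward failure
    return HEALTHY         # performing as expected

print(evaluate_health(120))   # Healthy
print(evaluate_health(800))   # Degraded
print(evaluate_health(2500))  # Unhealthy
```

The important point is that the parameters are data, not code: the operations team can adjust them without involving the development team.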

Northern Electronics Scenario


The solutions architect of Northern Electronics has suggested a product shipping solution
centered around three Web services:
• ShippingService Web service. This is the supplier's Web service that is used to send and
receive the details of the shipment pickup.
• PickupService Web service. This is the supplier's internal Web service that is used to
receive notification of a product pickup and to confirm that the shipment was picked up.
• TransportService Web service. This is the transport consolidator's Web service that is
used by the supplier to initially order the transport and finally to confirm that the
shipment was picked up.

The TransportService Web service is not directly implemented by Northern Electronics.
However, it still needs to be considered as part of the overall design because it forms part of the
overall functionality of the application.
Figure 3 illustrates the planned application flow between the Web services, databases, and
workstations used in the application.
Figure 3
Application flow for the Northern Electronics shipping application

Operations Challenges
A number of the problems with product shipping faced by Northern Electronics stem from the
existing product shipping application. The operations team for this application faces the following
challenges:

• They rely on users to detect and report faults. Sometimes, users cannot provide
sufficient or accurate information; this makes diagnosis and resolution of faults difficult,
costly, and time-consuming.
• They may have to visit the computer to investigate issues. The information they
receive or can extract from the event logs or performance counters may not provide the
appropriate data required to resolve the fault.
• They cannot easily detect some problems early. These problems include impending
failure of a connection to a remote service caused by a failing network connection or
lack of disk space on the server. They are unlikely to monitor performance counters and
event logs continuously and, instead, use them solely as a source of information for
diagnosing faults.
Development Challenges
The solutions architect is committed to making the new product shipping solution a manageable
application. However, he faces several challenges in achieving this goal:
• The development team has no experience in developing manageable applications, and
there is no budget for using external developer resources.
• Northern Electronics is planning to modify the design of its infrastructure, and these
plans are currently not finalized.
• Northern Electronics is planning to migrate early to Windows Vista and Windows Server
2008.

The solutions architect plans to use a management model for the application to help him
overcome these challenges.

Summary
This chapter examined the different perspectives that interact with an application and focused
more closely on the operations perspective, which must be well understood to design
manageable applications. It introduced some core principles that should be followed when
designing manageable applications. It also provided more details about the Northern Electronics
scenario.
Chapter 2
A High-Level Process for Manageable
Applications
The high-level process for manageable applications defines four interconnected stages that
capture the application through design, development, deployment, and operations, as shown in
Figure 1.

Figure 1
High-level process for manageable applications
This chapter describes each stage and demonstrates how the stages are used together in
manageable applications. As illustrated in Figure 1, the stages are the following:
• Design. A management model is used to define how the application will function in
operations. The management model captures, at an abstract level, the entities that
make up the application, the dependencies between them, the deployment model for
the application, and an abstract representation of the health and instrumentation in the
application.
• Develop. A manageable application will include extensive health and instrumentation
artifacts represented in the management model. Information contained in the
management model is used to help determine the specifics of the health and
instrumentation implementation. Instrumentation will include event IDs, performance
counters, categories, and messages. The application may also perform additional health
checks, such as synthetic transactions.
• Deploy. After the application is developed, it must be deployed. The infrastructure
model (defined as part of the management model) for the application affects the
specific environment that the application runs in, which in turn, affects the health and
instrumentation technologies that can be used. For example, an application deployed in
a low trust environment may not be able to log to a Windows Event Log.
• Operate. After the application is deployed, it must be operated on a day-to-day basis.
Typically, the operations team uses management tools to consume the health and
instrumentation information provided by the application in daily operations and makes
necessary changes to application configuration.

The order of these stages is important: adding the appropriate instrumentation to an
application on an as-needed basis at the end of the development process, or, even worse, after
completing testing and deployment, is unlikely to produce a manageable application. However,
in many cases, feedback during the cycle leads to further development of the management
model.

Roles Participating in the High-Level Process


The following four roles are primarily involved in the high-level process:

• Solutions architect. The solutions architect is responsible for defining the application at
the logical level. This involves determining how the application should be structured,
how health can be determined for the application (in an abstract sense), and the
instrumentation that is necessary to make that determination.
To help define the various manageability requirements of an application, the solutions
architect should create a management model; typically, this is created in collaboration
with the infrastructure architect.
• Developer. The developer is responsible for consuming the model created by the
solutions architect and creating the application, along with appropriate health,
instrumentation, and configuration artifacts, as defined in the model.
• Infrastructure architect. The infrastructure architect is responsible for specifying the
environment in which the application will run. This information may be specified in an
infrastructure model, which may affect decisions made by the solutions architect (for
example, the trust environment into which the application will be deployed). The
infrastructure architect must also ensure that the application can be deployed in the
environment; if it cannot be deployed in the environment, the infrastructure architect
must ensure that the appropriate changes are made to the application or the
environment.
• Operator. The operator is responsible for the ongoing running of the application and
responds to application and system alerts using a variety of operations tools. The
operator may also adjust run-time configuration of the application in response to
certain events.

Figure 2 illustrates how these job roles participate in the high-level process.
Figure 2
High-level process showing job roles
Many additional job roles participate at some point in the life cycle of a manageable application.
The following table lists these roles and the perspectives that they would hold on the
application. For more information about application perspectives, see Chapter 1,
"Understanding Manageable Applications."

Role                   Perspective       Description

User                   User              Uses the application.

User Product Manager   User              Defines user needs and required features of the
(User PM)                                application. Works with the solutions architect and
                                         infrastructure architect to define the service-level
                                         agreement (SLA) for the application.

Helpdesk               User or Operator  Responds to user problems. Works with operations to
                                         troubleshoot application problems. Records
                                         information that will assist operations and future
                                         development.

User Education         Developer         Responsible for content in error messages, events,
                                         and Help files.

Test                   Developer         Provides feedback to the developer during
                                         development cycles.
In many cases, individuals are responsible for more than one role in a project.

Understanding the Process


To understand how manageable applications are designed, implemented, deployed, and
managed, it is important to look at the process in more detail.

Designing the Manageable Application


The management model forms the starting point for a manageable application. One of the great
challenges of creating manageable applications is determining, at design time, the needs for the
application in daily operations. By investing time in creating an effective management model
early, you can dramatically increase the likelihood that your application is manageable later.

Creating a management model for the application does not prevent you from using an
iterative approach when designing your application—the model should be flexible enough to
be altered as changes occur in later iterations.

Typically, the infrastructure architect and the solutions architect are the main roles involved in
creating a management model. The infrastructure architect provides input about the
environment in which the application will be deployed, which may include factors such as
network connectivity, network zones, and allowed protocols. This information is critical to the
overall design, because it can affect the way instrumentation will be implemented in the
application. For example, if the application is to be deployed in a low-trust environment, it is
typically not possible to write events to an event log. In some cases (for example, for a shrink-
wrapped application), it may not be possible to determine in advance what the deployment
environment will be, so multiple trust levels may have to be supported.
Generally, the solutions architect is responsible for the specifics of the management model. The
management model defines how the application is broken into manageable operational units
(known as managed entities). It also contains abstract information about the application, which
defines how the application is developed, deployed, and, ultimately, how it is managed. This
information includes an instrumentation model, which indicates all the instrumentation points
for the application, and a health model, which indicates the various health states for the
application.
For more information about creating a management model, including information about how to
use the Team System Management Model Designer Power Tool (TSMMD) tool, see Chapter 4,
"Creating Effective Management Models," Chapter 5, "Instrumentation Best Practices," and
Chapter 6, "Specifying Infrastructure Requirements."

Developing the Manageable Application


After an effective management model is created for the application, the application itself needs
to be developed, and the information contained in the model must be incorporated. The
developer is responsible for taking the abstract elements in the model and generating concrete
artifacts in the code. In particular, the developer typically incorporates specific instrumentation,
such as the following:

• Event log events


• WMI events
• Performance counters
• Event traces

The developer may also need to incorporate specific health indicators, which are used to
determine the health of an application, and configurability support, which is used to modify
what instrumentation is used at run time.
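The separation that generated instrumentation helpers provide can be sketched as follows: application code raises an abstract, named event, and the helper decides which concrete technologies receive it. This sketch uses Python's logging module as a stand-in for the Windows technologies listed above (event log, WMI, performance counters, event traces), and the event and entity names are invented for illustration:

```python
import logging

# Stand-in for concrete instrumentation technologies (event log, ETW, WMI).
logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")

class InstrumentationHelper:
    """Maps abstract, named events onto concrete instrumentation calls,
    so application code never references a specific technology API."""

    def __init__(self, entity_name):
        self._log = logging.getLogger(entity_name)
        self.events_raised = []  # recorded for this sketch only

    def raise_event(self, event_name, severity, message):
        self.events_raised.append(event_name)
        # A generated helper would fan out here to the event log,
        # WMI, a performance counter, or an event trace as appropriate.
        self._log.log(getattr(logging, severity), "%s: %s", event_name, message)

# Application code calls only the abstract event, never a technology API.
helper = InstrumentationHelper("ShippingService")
helper.raise_event("DatabaseConnectionFailed", "ERROR",
                   "Could not open a connection to the Shipping database.")
```

Because the application depends only on the abstract `raise_event` call, the concrete technologies behind it can change without touching application code.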

Deploying the Manageable Application


After the application is developed, it must be deployed (for simplicity, testing is intentionally
omitted from this process). For an application to be truly manageable, you should have a high
degree of control over the deployment of that application. This allows you to more easily
manage the process of changes to the application. Also, during deployment, specific
configuration settings for the application may be chosen.
Typically, manageable applications should be deployed using a redistributable Microsoft
Windows Installer (.msi) package or SMS package. However, the specifics of the deployment
model are beyond the scope of this guide.

Operating the Manageable Application


After the application is deployed to its target environment, it must be operated. The operator is
responsible for managing the application, using an administrative console, and supporting tools
such as Event Viewer and Performance Logs and Alerts. The operator may also use more
advanced tooling, such as System Center Operations Manager 2007, with the application.
In many cases, information contained in the management model can be consumed at run time
by operations. This may be as simple as the operator using a report generated from the original
model to understand the workings of the application, or it may be a Management Pack
automatically generated from the original management model.
Facilitating the Process – Guidance and Artifacts
The following tools and artifacts can be used to facilitate the process of designing,
implementing, deploying, and managing applications:

• Guidance. This guidance can be used at all stages of the application life cycle. Chapter 4
includes detailed architectural guidance for designing manageable applications.
Chapters 8–15 provide developer and deployment guidance. Chapters 16 and 17
provide detailed guidance for operating manageable applications.
• TSMMD. This tool is integrated with Visual Studio; it supports many of the
requirements involved in developing manageable applications. The feature set of
TSMMD includes the following:
◦ Modeling capabilities. You can use the tool to model many of the artifacts
required in a manageable application. TSMMD represents the application as a
series of related managed entities. By defining different properties of the
managed entities, you can create an abstract representation of application
health, instrumentation, and the target infrastructure.
◦ Automated generation of instrumentation code. TSMMD includes recipes for
automatically generating instrumentation code from the information in the
management model. Instrumentation code is generated in the form of
instrumentation helpers, which separate the process of instrumentation from
the application itself. This means that the application can call abstract
instrumentation, and the application developer does not have to worry about
the specifics of the instrumentation technologies being used.
◦ Validation. TSMMD supports two forms of validation. It ensures that the model
is internally consistent and does not contain orphaned elements. It also
validates that defined instrumentation is called from the application. If
instrumentation represented in the management model is not included in the
application code, the tool generates warnings in Visual Studio.
◦ Management Pack Generation. The TSMMD can generate Management Packs
for System Center Operations Manager directly, using the information about the
instrumentation stored in the Management Model.
• MMD. The Management Model Designer (MMD) is a standalone tool that can be used
to create a hierarchy of managed entities and define a health model for the application.
The MMD can also be used to create Management Packs for Microsoft Operations
Manager (MOM) 2005 and System Center Operations Manager 2007.
• Trust levels. In some cases, the architect will not know the specifics of the
deployment environment for the application. By specifying multiple trust levels for an
application, the application can support multiple deployment environments, and the
decision about which trust level to use can be deferred until deployment time. For more
details, see Chapter 6, "Specifying Infrastructure Requirements."
• Run-time configuration. At an architectural level, it is usually not possible to be sure
exactly how the application will be used in daily operations. Therefore, the architect
should support flexible operations by providing run-time configuration of the
application. Typically, manageable applications need to support run-time configuration
of instrumentation so the operations team can turn instrumentation on and off in
real time and modify the granularity level of instrumentation.
• Management Packs. Management Packs provide a predefined, ready-to-run set of
rules, monitoring scripts, and reports that encapsulate the knowledge required to
monitor, manage, and report about a specific service or application. A Management
Pack monitors events that are placed in the application event log, system event log, and
directory service event log by various components of an application or subsystem. The
rules and monitoring scripts also can monitor the overall health of an application or
system and alert you to critical performance issues in several ways:
◦ They can monitor all aspects of the health of that application or system and its
components.
◦ They can monitor the health of vital processes that the application or system
depends on.
◦ They can monitor service availability.
◦ They can collect key performance data.
◦ They can provide comprehensive reports, including reports about service
availability and service health and reports that you can use for capacity
planning.
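The run-time configuration item above can be sketched as a settings object that the instrumentation path consults on every call, so the operations team can raise or lower the reporting level without restarting the application. The level names and their ordering are assumptions for this illustration:

```python
# Sketch: run-time configurable instrumentation verbosity.
# The level names and their ordering are illustrative assumptions.

LEVELS = {"Off": 0, "Error": 1, "Warning": 2, "Verbose": 3}

class InstrumentationConfig:
    """Mutable settings the operations team can change while the app runs."""
    def __init__(self, level="Error"):
        self.level = level

    def is_enabled(self, event_level):
        return self.level != "Off" and LEVELS[event_level] <= LEVELS[self.level]

config = InstrumentationConfig(level="Error")
emitted = []

def trace(event_level, message):
    # Consult the current configuration on every call, so changes take
    # effect immediately, without an application restart.
    if config.is_enabled(event_level):
        emitted.append((event_level, message))

trace("Verbose", "Entering ConfirmShipment")  # suppressed at the Error level
config.level = "Verbose"                      # operator raises the granularity
trace("Verbose", "Entering ConfirmShipment")  # now emitted
```

In a real application the configuration would be read from an external store (for example, a configuration file or WMI) rather than mutated in process, but the principle is the same: the instrumentation level is an operational setting, not a compile-time decision.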

Figure 3 illustrates how the guidance and other artifacts provided can be used to facilitate the
process.
Figure 3
The process showing guidance and artifacts

Northern Electronics Scenario


The solutions architect has decided the overall design of the Northern Electronics shipping
application and plans to use a management model to help him ensure that the application is
manageable by the operations team.
The solutions architect has read this guide and has decided to apply its guidance when creating
the Northern Electronics shipping application. He has identified the other key stakeholders
involved in the application and needs input from them to create the management model. The
operator provides information about the manageability requirements for the application, and
the infrastructure architect helps the solutions architect define the likely target deployment
environment.
The four key roles involved in designing, developing, deploying, and operating the Northern
Electronics shipping application are as follows:
• Solutions architect. The solutions architect will use the information obtained from the
operator and the infrastructure architect to create a management model for the
application. The solutions architect has decided in this case to use both the TSMMD and
MMD. These tools will provide the solutions architect with the following functionality:
◦ Modeling. The Northern Electronics shipping application will be represented as
managed entities, which map to the operations view of the application. The
solutions architect will also model health and instrumentation for the
application, using a combination of the TSMMD and MMD tools.
◦ Model validation. The validation feature of the TSMMD tool will help the
solutions architect ensure that he has created an internally consistent
management model for the Northern Electronics application.
• Developer. The senior developer was worried about his lack of experience in creating
manageable applications, but he is now confident that the TSMMD tool will help in
development. He will use the following functionality provided with the tool:
◦ Code generation. Automatic generation of code for instrumentation helps
ensure that the code for the application has no errors and conforms to the
requirements of the solutions architect.
◦ Code validation. This helps the developer ensure that he calls the
instrumentation code from the application.
• Infrastructure architect. The infrastructure in which the Northern Electronics shipping
application runs is changing, and the infrastructure architect is currently unsure about the
trust level the application should support. Therefore, he has specified a requirement
that the application should run successfully in both a low trust environment and a high
trust environment. The decision about which trust level to use in the application will be
made at deployment time.
• Operator. The senior operator has experience operating business applications at
Northern Electronics; as such, he has some requirements that he has communicated to
the solutions architect. These include the following:
◦ The shipping application must be developed with the operations team in mind.
◦ Events must be relevant to the operations team.
◦ Instrumentation must be configurable at run time.
◦ The application must be manageable from System Center Operations Manager
2007. The MMD tool can be used to generate a Management Pack for the
application that is useable in System Center Operations Manager 2007.

Summary
This chapter examined a high-level process for designing, developing, deploying, and operating
manageable applications. It examined the roles that participate in that process and the
responsibilities that each role holds. It also examined the artifacts that are available to facilitate
the process of designing manageable applications.
Section 2
Architecting for Operations
This section examines the architectural principles that should be followed when designing
manageable applications. It examines management models and looks in detail at modeling
health and instrumentation. It also captures best practices for instrumenting applications, and it
discusses how to instrument applications that may be deployed to different infrastructures.
Lastly, it shows how to use the Team System Management Model Designer Power Tool
(TSMMD) to create a management model for an application.
This section should be of use primarily to solutions architects and infrastructure architects.
Chapter 3, "Architecting Manageable Applications"
Chapter 4, "Creating Effective Management Models"
Chapter 5, "Proven Practices for Application Instrumentation"
Chapter 6, "Specifying Infrastructure Trust Levels"
Chapter 7, "Specifying a Management Model Using the TSMMD Tool"
Chapter 3
Architecting Manageable Applications
There are a number of significant challenges the architect faces when determining how to
design a manageable application. This chapter examines the fundamental design principles that
must be addressed. It then demonstrates a structure of a manageable application. Finally, it
shows how creating a management model for the application can simplify the work of the
architect and other members of the development team.

Designing Manageable Applications


To design manageable applications, architects should adhere to a number of fundamental
design principles, including the following:

• A management model should be defined for the application. A management model
provides a single authoritative source of knowledge about an application. As a
minimum, the management model you define should capture the dependencies
between different parts of the application, the logical flow of the application, the
instrumentation that will be used to support effective operations, and artifacts that will
be used to measure application health. The architect of the application is the individual
most likely to understand how the application operates and, more importantly, fails to
operate. By describing this in a management model, the architect can communicate to
developers what instrumentation is required and communicate to operations how the
application can be managed.
• The application should expose comprehensive relevant instrumentation.
Instrumentation should provide information that is consistent with the operations view
of the application. Coarse-grained instrumentation can be provided to indicate the
health state of the application, and additional fine-grained instrumentation can provide
supporting diagnostic information to help troubleshoot application problems. Where
possible, instrumentation should also reflect the relationship between the application,
the platform, and the underlying hardware; this allows the operator to relate problems
at a lower level with changes in the health of the application.
• The application should be designed so that its health can be accurately determined. A
number of factors contribute to accurately determining the health of an application. A
well instrumented application will provide information that can be evaluated against a
rule set to determine application health. In some cases, it may be necessary to provide
additional health indicators, such as an application heartbeat or support for a
comprehensive health check of a managed entity. However, while these indicators may
be part of the application itself, the entity responsible for measuring application health
should be separated from the application itself.
• The design of the application should support separation of concerns. The solutions
architect, infrastructure architect, and developer will all have a role in designing a
manageable application. Therefore, it is very important for the design of the application
to allow each role to separate its specific concerns from those of the other roles.

• The application instrumentation should be isolated from the rest of application code.
The architect should make informed choices about the instrumentation technologies to
use, and enforce the use of those technologies by isolating the instrumentation code in
an instrumentation helper. In this case, the application developer calls only abstract
instrumentation code and this is mapped to concrete instrumentation technologies. For
more details, see Chapter 5, "Proven Practices for Application Instrumentation."
• The application should be designed with the target environment (or environments) in
mind. Some instrumentation technologies cannot be used in low trust environments
because they require the application to have a higher level of trust than is available.
Abstracting the specifics of instrumentation can allow for increased flexibility in this
area. In cases where the architect knows the nature of the deployment environment
ahead of time, the appropriate concrete instrumentation can be mapped to the
abstract representation of the instrumentation. In other cases, increased flexibility will
be needed, and the decision about the specific instrumentation technology used must
be deferred until the application is deployed.
• The application should provide configuration options useful to the operations team.
The information provided by extensive instrumentation is of use to the operations team
only if they can perform an action based on that information. In some cases, the
operations team will need to restart the application or individual services. In other
cases, it may be possible to make other real-time changes to application configuration
to solve a problem. Instrumentation information that is closely related to configuration
options is more relevant to operations. It should also be possible to configure the
instrumentation options themselves—for example, to increase the amount of
information that is reported when troubleshooting a problem. Where possible,
configuration settings should be constrained to ensure that the operations team does
not create incorrect settings or change the wrong settings.

• Application code should be auto-generated where appropriate. Auto-generating code
according to the requirements defined by the application architect can increase
efficiency by saving time and reducing errors. The architect can use the management
model to define in some detail the instrumentation requirements for the application;
this allows much of the instrumentation code in the application to be auto-generated
from the pre-defined requirements.
• The application design should conform to effective, proven design principles. When
designing a manageable application, architects have to consider additional factors that
affect the design of the application. However, this does not prevent the architect from
designing the application so that it adheres to existing, proven design patterns. You
should always make sure your application is well designed throughout, and considering
key design patterns during the application design process helps ensure this.
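The target-environment principle above can be sketched as a factory that maps the abstract instrumentation requirement onto a concrete technology at deployment time, based on the configured trust level. The trust-level names and the technology choices here are illustrative assumptions (for example, that a file-based trace is acceptable where event log access is not):

```python
# Sketch: defer the choice of concrete instrumentation technology until
# deployment, based on a configured trust level. Names are illustrative.

class EventLogWriter:
    """Stands in for a writer that requires high trust (e.g., permission
    to write to a Windows Event Log)."""
    def write(self, message):
        return f"EventLog: {message}"

class FileTraceWriter:
    """Stands in for a writer usable in low trust, where event log
    access may be denied."""
    def write(self, message):
        return f"FileTrace: {message}"

def create_writer(trust_level):
    # A deployment-time setting selects the concrete technology;
    # application code only ever sees the abstract write() call.
    if trust_level == "High":
        return EventLogWriter()
    if trust_level == "Low":
        return FileTraceWriter()
    raise ValueError(f"Unknown trust level: {trust_level}")

# In practice the trust level would be read from deployment configuration.
writer = create_writer("Low")
print(writer.write("ShipmentPickupConfirmed"))
```

Because both writers expose the same abstract interface, the application behaves identically in either environment; only the destination of the instrumentation changes.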

Representing Applications as Managed Entities


A managed entity is any logical part of an application that a system administrator needs to
configure, monitor, and create reports about while managing that application or service.
Examples of managed entities are a Web service, a database, an Exchange routing group, an
Active Directory site, a computer, a server role, a network device, a hardware component, or a
subnet.
It is important to understand that administrators will evaluate what to monitor and what actions
to take based on the importance of an application in meeting the business needs of their
organization. They will not base these decisions on how the software is physically built or on any
internal organizational divisions that may have affected its design. For these reasons, when
defining a managed entity, it is a good practice to use internal architectural design documents as
a starting point, focusing on the logical objects and relationships that operators will understand.
Every managed entity that makes up your application performs a discrete set of operations.
These operations are interesting from a monitoring, dependency, and troubleshooting
perspective, but they are not objects that an administrator would think of as being able to
configure or manipulate like a managed entity. The collection of these subdivisions is referred to
as the aspects of a managed entity.
Typically, managed entities have relationships with other managed entities. The relationships
indicate the dependencies that exist between different managed entities. For example, consider
an application that consists of a number of Web services and databases. Figure 1 illustrates the
dependencies between the different entities that make up the application.
Figure 1
Dependencies between entities

Relationships between managed entities are very important in a management model. These
relationships can directly affect the health, instrumentation, and performance of an entire
system. For example, in Figure 1, the Products Web service is dependent on the Products
database and the Transport Web service. This means that a change in the health state of the
Transport Web service may affect the health of the Products Web service.
The way these relationships are specified depends on the tooling used to represent the
management model. For example, the Management Model Designer (MMD) tool (which focuses
predominantly on health) enforces a parent-child hierarchy between managed entities and uses
the relationship to determine the health of a managed entity. In this case, the health of child
managed entities is rolled up to provide an indication of the health of a parent managed entity.
By contrast, the Team System Management Model Designer Power Tool (TSMMD) tool (which
focuses predominantly on instrumentation) does not use a parent-child relationship.
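The MMD-style roll-up described above can be sketched as follows: the health of a parent managed entity is taken to be the worst health state among itself and its children. The "worst state wins" rule is an assumption for this sketch (real tooling supports other aggregation rules), and the entity names follow the example in Figure 1:

```python
# Sketch: roll child health states up to a parent managed entity.
# "Worst state wins" is an assumed roll-up rule for this example.

SEVERITY = {"Healthy": 0, "Degraded": 1, "Unhealthy": 2}

class ManagedEntity:
    def __init__(self, name, state="Healthy", children=None):
        self.name = name
        self.state = state
        self.children = children or []

    def rolled_up_state(self):
        # Compare this entity's own state against its children's
        # rolled-up states and keep the most severe one.
        states = [self.state] + [c.rolled_up_state() for c in self.children]
        return max(states, key=lambda s: SEVERITY[s])

# Entity names follow the dependency example in Figure 1.
transport = ManagedEntity("Transport Web service", state="Degraded")
products_db = ManagedEntity("Products database")
products = ManagedEntity("Products Web service",
                         children=[products_db, transport])

print(products.rolled_up_state())  # Degraded: the Transport dependency rolls up
```

This makes concrete the point in the text: a change in the health state of the Transport Web service changes the reported health of the Products Web service that depends on it.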

Advantages of Using Managed Entities


Dividing applications into managed entities brings a number of advantages to an application
architect creating manageable applications, including the following:
• It provides an operations view of the application.
• It ensures that instrumentation is sufficient.
• It provides a close mapping to configuration.
The next sections describe each of these items in more detail.

Providing an Operations View of an Application


Effective operations often rely on a "divide and conquer" approach. By dividing the operations
environment into a series of interdependent managed entities, the operations team can quickly
diagnose the source of any problem and determine the effects of the problem.
This approach has proven effective for the infrastructure on which applications run, but
applications themselves are often viewed as single, irreducible units. If the application itself can
be represented as discrete managed entities, supported with comprehensive instrumentation,
the architect allows the operations team to isolate the particular managed entity that has the
problem. If the developer also allows the service to be configured, the operations team may be
able to fix problems without contacting the developer.
Separating the application into managed entities also allows you to determine and document
additional information that is highly useful to the operations team. For example, managed
entities will often have dependencies on one another and on other external managed entities
(such as a partner Web service). By isolating these dependencies, you can determine the effect
that the failure of a managed entity will have on the functionality of the entire application.
It should also be possible to determine a logical flow for the application, which shows how the
managed entities communicate with one another in the course of normal business operations.
This information will also help the operations team determine the effects of a failure in a
managed entity.

Ensuring That Instrumentation Is Sufficient


One of the great challenges of designing manageable applications is providing comprehensive,
relevant instrumentation for the application. That task is made significantly easier in
applications represented as managed entities. The managed entities and relationships between
them should already reflect an operations view of the application, so events associated with
managed entities should be relevant to operations personnel. If events are associated with all
managed entities and with communications between managed entities, those events should
capture all the operations of the application and can be used as a basis for determining the
health state of the application.

Close Mapping to Configuration


Configuration provides a way for the operations team to diagnose and troubleshoot problems,
to improve application performance, to specify an appropriate level of instrumentation at run
time, and to alter configuration to reflect changes in the underlying application environment. As
already discussed, managed entities provide an operations view of an application, so the
developer should ensure that each managed entity has its own configuration settings. This helps
to ensure that the operator is provided with a consistent view of the application, centered on
managed entities, and does not need to understand a different application model for
configuration purposes.

Benefits of Defining a Management Model for the Application


Nothing about the design of a manageable application compels you to create a management
model. However, using a management model as the starting point dramatically simplifies the
process of application design. The management model allows the architect to capture important
information about the application, which can then be used as the basis for further development
of a manageable application. For more information about management models, see Chapter 4,
"Creating Effective Management Models."
The artifacts discussed in this chapter can all be modeled using the TSMMD tool, including the
managed entities, the abstract instrumentation, and the mappings between abstract
instrumentation and concrete instrumentation. The tool can then be used to generate
instrumentation code from the model. The standalone MMD tool can also be used to define
aspects of a management model. For more information about using each of these tools, see
Chapter 7, "Specifying a Management Model Using the TSMMD Tool."

Designing, Developing, Deploying, and Maintaining Manageable Applications: Refining the Process

Chapter 2, "A High-Level Process for Manageable Applications," outlined a high-level process for
designing, developing, deploying, and maintaining manageable applications. However, this
process can be refined using the additional guidance and tooling introduced in this chapter. The
process can now be summarized as follows:
1. Use the architectural guidance contained in this guide to determine how to design
your application.
2. Use the TSMMD tool to create an operations view of the application and to model
health and instrumentation artifacts for the application.
3. Generate instrumentation code for the application from the model.
4. Call abstract events from the application code.
5. Build the application, including the instrumentation helper.
6. Test the application.
7. Deploy the application.
8. Manage the application.
More information about the stages of this process can be found throughout the rest of this
guide. For a detailed walkthrough of using the TSMMD tool, see Chapter 7, "Specifying a
Management Model Using the TSMMD Tool."

Northern Electronics Scenario


The solutions architect of Northern Electronics has defined the following managed entities for
the application:
• One managed entity for each Web service in the application:
◦ ShippingService
◦ PickupService
◦ TransportService
• One managed entity for each database used by the application:
◦ Transport
◦ Shipping
• One or more managed entities for each workstation application that communicates
with a Web service:
◦ WarehouseClient. This corresponds to the application running on the
warehouse workstation.
◦ PickupConfirmationClient. This corresponds to the application running on the
loading dock workstation.
◦ OrderClient. This corresponds to the application running on the shipping clerk
workstation.
◦ PickupNotificationClient and ProcessOrderClient. These correspond to the
application running on the transport office workstation.

The application running on the Transport Office workstation has two distinct pieces of
functionality of concern to the operations team. The solutions architect has decided to reflect
this by representing the application as two separate managed entities.
The solutions architect plans to use these managed entities as the basis for an application
management model. As a minimum, he plans to define abstract events and measures for each
managed entity, along with default instrumentation levels for each abstract event, and
mappings to concrete instrumentation technologies. He will also define trust levels for the
application and health states for the application.
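The architect's plan could be captured in a simple data structure before it is entered into the TSMMD tool. The following Python sketch is purely illustrative: the event names, level names, and target names are invented for the example, and the TSMMD tool stores this information in its own model format, not in Python.

```python
# Hypothetical sketch of the planned management model. Event names,
# instrumentation levels, and targets are invented for illustration.
model = {
    "managed_entities": [
        "ShippingService", "PickupService", "TransportService",
        "Transport", "Shipping",
        "WarehouseClient", "PickupConfirmationClient", "OrderClient",
        "PickupNotificationClient", "ProcessOrderClient",
    ],
    "abstract_events": {
        # each abstract event carries a default instrumentation level and
        # mappings to one or more concrete instrumentation technologies
        "ShippingService.OrderReceived": {
            "level": "Coarse",
            "targets": ["EventLog", "WMI"],
        },
        "TransportService.DatabaseUnavailable": {
            "level": "Coarse",
            "targets": ["EventLog"],
        },
    },
}

print(len(model["managed_entities"]))  # 10 entities, as listed above
```

Writing the model down in this skeletal form, even before tooling is involved, makes it easy to review with the operations team that every managed entity has at least some instrumentation attached to it.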
Summary
This chapter examined the overall design of a manageable application and discussed the design
principles that should be adhered to when architecting manageable applications. It also used
these principles to refine the high-level process previously discussed in Chapter 2, "A High-Level
Process for Manageable Applications," and provided additional information about the Northern
Electronics Scenario.
Chapter 4
Creating Effective Management Models
Creating a management model is a key part of designing manageable applications.
Comprehensive management models provide an abstract representation of all knowledge about
the application; they do this by capturing information that is relevant to the successful
management of the application. Management models ensure that manageability is built into
every service and application; they also ensure that management features are aligned with the
needs of the administrator who will be running the application. As a result, they can
dramatically simplify the deployment and maintenance of applications in a distributed IT
environment.
Information contained in a comprehensive management model for an application has a number
of uses for the operations team, including the following:
• It provides operations with a broader view of the applications they need to maintain by
encapsulating all the information about an application in a coherent, organized manner.
• It provides an abstraction of day-to-day operations from low-level technologies. For
example, if a database that forms part of a business application fails, the operations
team will often have to examine low-level events in a SQL log to determine the cause of
a problem. However, if the management model encapsulates the functionality of the
application, a management tool can be used to diagnose and correct the problem.
• It demonstrates how the various technologies that form a solution relate to one
another in operations.
• It predicts the impact of proposed changes to the environment.
• It provides effective troubleshooting information and a detailed view of issues,
including the impact of any problem.
• It provides well-defined, prescriptive configurations for deployment.
• It automates operations with pre-defined command line tools and scripting.

The output from a management model can form the basis for the definition of many artifacts
required during development, including instrumentation and health artifacts. This ultimately
leads to well-designed application instrumentation that supports full monitoring, diagnosis, and
troubleshooting by IT operations staff. Effective management models can also reduce the time
needed to adopt a new application, because operations staff will have a more thorough
understanding of the application architecture.
Management models should represent the application as comprehensively as possible.
However, even a partial management model can be very useful in creating a manageable
application. This chapter discusses the elements that make up a comprehensive management
model, and then it discusses in more detail two of the key areas that the rest of this guide will
focus on: instrumentation and health.

Benefits of Using Management Models


Creating comprehensive management models provides a total system view and provides many
benefits, including the following:
• All interrelated software and hardware components managed by the administrator can
be captured in a single source.
• Prescriptive configurations and best practices can be captured in a single knowledge
base; this allows changes to the system to be tested before the changes are
implemented.
• The infrastructure that holds the system model captures and tracks the configuration
state, so administrators do not have to maintain it in their heads.
• Administrators do not have to operate directly on the real-world systems; instead, they
can model changes before committing them. This allows "what if" questions to be tried
out without impacting the business.
• Knowledge of the total system view can improve over time. When the system is
developed, basic rules and configurations are defined. As the system is deployed, the
details of the configuration and environmental constraints or requirements are added.
As operational best practices are developed or enhanced, they can also affect the
model.
• The management model becomes the point of coordination and consistency across
administrators who have separate but interdependent responsibilities.

Management Model Views


Typically, management models are consumed in different ways, so a comprehensive
management model must capture the elements of a system from a number of different views.
The management model should encapsulate the following two common views:

• Layered view. Applications have dependencies on many other layers of technology,
including databases, operating systems, and hardware. These layers should all be
captured in a comprehensive management model, so the impact of changes in a
particular layer can be understood.
• Administrative view. Typically, administrative responsibilities are split between
different administrative roles. It is important for the management model to capture
these roles, so a problem can be assigned to the appropriate team for resolution.
Administrative responsibilities could include the client computer desktop, network,
Active Directory, database, and application.

After a comprehensive management model is in place, management of the complete system can
be performed through the model.
Comprehensive Management Models
Creating a comprehensive management model consists of modeling in a variety of different
areas to provide a total system view, including the following:

• Configuration modeling. This involves encapsulating all the settings that control the
behavior or functionality of an application or system component.
• Task modeling. This involves cataloging the complete list of tasks that administrators
have to perform to administer and manage a software system or application.
• Instrumentation modeling. This involves capturing the instrumentation used to record
the operations of a system or application. Instrumentation provides information to the
operations team to increase understanding about how the application functions, and to
diagnose problems with an application.
• Health modeling. This involves defining what it means for a system or application to be
healthy (operating normally) or unhealthy (operating in a degraded condition or not
working at all). A health model represents logically the parts of an application or service
the operations team is responsible for keeping operational.
• Performance modeling. This involves capturing the expected baseline performance of
an application. Performance counters can then be used to report and expose
performance on an ongoing basis, and a monitoring tool can compare this performance
to the expected performance.

The next sections describe each of these tasks in more detail.

Configuration Modeling
In a corporate setting, system administrators frequently have to configure thousands of client
computers and hundreds of servers in their organizations. Standardizing and locking down
configurations for client computers and servers helps simplify this complexity. Recent studies on
total cost of ownership (TCO) identify loss of productivity at the desktop as one of the largest
costs for corporations. Lost productivity is frequently attributed to user errors, such as
modifying system configuration that renders their applications unworkable, or to complexity,
caused by non-essential applications and features on the desktop. Configuration modeling
attempts to address this problem by capturing all the settings that control the behavior or
functionality of an application or system component.
Configuration modeling addresses only those settings that are controllable by an administrator
or an agent. Typically, a configuration model captures the valid configuration settings for client
computers and users, and also for member servers and domain controllers in an Active Directory
forest.
In many cases, configurations will be standardized and centrally managed using technologies
such as Group Policy or Systems Management Server (SMS).
For an application to be managed using Group Policy, that application must have built-in
support for Group Policy. Applications that use Enterprise Library can be managed by Group
Policy, because Enterprise Library includes this support.
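A per-entity configuration model can be sketched minimally as follows. This is a hypothetical Python illustration; the entity name, setting names, and validation scheme are invented, and a real .NET application would typically use configuration files or Group Policy instead.

```python
# Sketch: each managed entity carries its own administrator-controllable
# settings, validated against the configuration model before being applied.
# The schema below is an invented example.
SCHEMA = {
    "ShippingService": {"timeout_seconds": int, "instrumentation_level": str},
}

def apply_setting(entity, key, value, schema=SCHEMA):
    """Validate a setting against the configuration model, then apply it."""
    expected = schema.get(entity, {}).get(key)
    if expected is None:
        raise KeyError(f"{key} is not a configurable setting of {entity}")
    if not isinstance(value, expected):
        raise TypeError(f"{key} must be of type {expected.__name__}")
    return {entity: {key: value}}

print(apply_setting("ShippingService", "timeout_seconds", 30))
```

Capturing the valid settings and their types in one place is the essence of configuration modeling: it lets administrators (or tooling) reject an invalid change before it renders the application unworkable.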

Task Modeling
Administrators typically must learn to use multiple tools to achieve a single administrative task.
Task modeling helps address this problem by enumerating, as defined tasks, the activities that
are performed when managing a system. These may be maintenance tasks, such as backup;
event-driven tasks, such as adding a user; or diagnostic tasks performed to correct system
failures. Defining these tasks guides the development of administration tools and interfaces and
becomes the basis for automation. The task model can also drive self-correcting systems when
used in conjunction with instrumentation and health models.
Task-based administration uses tasks to administer systems. Task models describe
administration of the component or application in terms of tasks. Tasks are defined as complete
actions that accomplish a goal that has a direct value to the administrator. They enable task-
based administration; this makes it easier to define, enforce, and delegate responsibilities to
different system administrators. In the future, task models will provide a foundation for role-
based access control.
Building all command-line and GUI administration tools based on the same task model can
dramatically lower the time and effort required to learn how to manage Windows operating
systems, server applications, and client applications; it also makes it possible to automate
system administration tasks.
The following are the most important benefits of building a task-based administration model:
• Administrative tasks can more closely reflect the operations experience. The
administration of applications is described in terms of tasks that are understandable by
system administrators instead of simply reflecting the way in which the application was
developed.
• User experiences are consistent with the administrative tools. Administrative tools
may be GUI-based snap-ins, command line-based utilities, or scripts (for example,
PowerShell scripts). Consistency between all these administrative tools allows
administrators to start working with the system using easy-to-understand GUI tools,
and then directly use this knowledge to manage applications with command-line tools
and build automated management scripts.
• Role-based administration is easier to implement. Task models can be the foundation
for implementing role-based administration for your application. Role-based
administration allows you to simplify the access control list (ACL) complexity that exists
today. Task models provide a simplified method for assigning and grouping
responsibilities and access rights. A user role can then be defined as a collection of
tasks. Being a member of a particular user role simply implies being allowed to perform
a set of tasks.
• System management costs for your software are easier to estimate. Each task in the
task model has an associated cost when performing the task. The cost of executing a
task depends on different factors, such as how frequently the task should be
performed, how long it takes to do it, the skill level of the person who runs it, and so on.
It currently takes a substantial amount of effort to gather these statistics. Capturing this
data in task models allows you to do the following:
◦ Calculate the management cost for your product.
◦ Compare it to the management cost of the previous version or a competitor’s
product.
◦ Show your customers the financial benefits of migrating to the new version.
◦ See what tasks cost your customers the most to perform.
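The idea above that a user role is simply a collection of tasks can be sketched directly. The role and task names in this Python illustration are invented; the point is only that authorization reduces to set membership.

```python
# Sketch of role-based administration built on a task model: a role is a
# set of tasks, and permission checking is membership in that set.
# Role and task names are invented for illustration.
ROLES = {
    "BackupOperator": {"backup-database", "restore-database"},
    "AccountAdmin": {"add-user", "remove-user", "reset-password"},
}

def can_perform(role, task):
    """True if the given role includes the requested task."""
    return task in ROLES.get(role, set())

print(can_perform("BackupOperator", "backup-database"))  # True
print(can_perform("BackupOperator", "add-user"))         # False
```

Because each role is defined entirely by its task set, delegating a responsibility to a different administrator is a matter of moving tasks between sets rather than editing access control lists on individual resources.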

Instrumentation Modeling
Applications often contain minimal instrumentation or instrumentation that is not relevant to
operations. This results in applications that are difficult to manage, because the operations staff
is not provided with the information it needs to manage the application on a daily basis or to
troubleshoot issues as they occur.
Instrumentation modeling helps to ensure that appropriate instrumentation is built into the
application from the beginning. An instrumentation model allows you to discover the
appropriate instrumentation requirements and then implement this instrumentation within the
application.
Benefits of instrumentation modeling include the following:
• It makes the task of developing the instrumented application more straightforward
for the application developer. The application architect can create the instrumentation
model in abstract form in advance of the development process, clearly defining the
nature of instrumentation required in the application.
• It provides relevant feedback about the application to the operations staff. Well-
designed instrumentation will correlate closely to the operations view of the application
and assist in daily operations tasks. In other words, it will correspond directly to the
configuration and task models. At a deeper level, instrumentation will provide
diagnostic information that the operations team can use to troubleshoot application
problems.
• It provides feedback about the application to the application developers.
Instrumentation can also provide information to a developer that is directly relevant to
the design of the application. This makes application testing easier, and it reduces the
costs of future development cycles. This type of instrumentation is generally hidden from
operations.
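The separation between abstract instrumentation (defined in the model ahead of development) and its concrete implementation can be sketched like this. The Python names are hypothetical; the TSMMD tool generates analogous instrumentation helpers in .NET, and the in-memory list stands in for a real event sink such as the Windows Event Log.

```python
# Sketch: application code raises abstract events through a helper;
# concrete instrumentation technologies are pluggable targets.
class EventLogTarget:
    """Stand-in for a concrete technology such as the Windows Event Log."""
    def __init__(self):
        self.entries = []
    def write(self, message):
        self.entries.append(message)

class Instrumentation:
    """Instrumentation helper: application code calls raise_event and
    never deals with the concrete technology directly."""
    def __init__(self, targets):
        self.targets = targets
    def raise_event(self, name, detail):
        for target in self.targets:
            target.write(f"{name}: {detail}")

log = EventLogTarget()
instr = Instrumentation([log])
instr.raise_event("OrderReceived", "order 42 accepted")
print(log.entries)
```

Because the application only ever sees the abstract helper, the mapping from abstract events to concrete technologies can be changed in the model without touching application code.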

Health Modeling
Health modeling defines what it means for a managed entity to be healthy or unhealthy. Good
information about the health state of an application or system is necessary for maintaining,
diagnosing, and recovering from errors in applications and operating systems deployed in
production environments.
Health modeling uses instrumentation as the basis on which monitoring and automated
recovery is built. Frequently, information is supplied in a way that has meaning for developers
but does not reflect the user experience of the administrator who manages, monitors, and
repairs the application or system day to day. Health models allow you to define both what kinds
of information should be provided and how the administrator and the application or system
should respond.
When customers are evaluating a new application, they expect to receive important information
about its capabilities, along with deployment and setup instructions. However, they frequently
are never given the guidance or tools to operate that software on a daily basis after it is
deployed.
Providing IT and operations customers with the correct view of an application (what it looks like
when it is functioning normally and when it is not), together with the knowledge needed to
troubleshoot issues, allows them to meet their service level agreements (SLAs) with their own
customers. Troubleshooting guidance and automated monitoring capabilities
delivered to customers when an application is released will substantially improve the adoption
and deployment rates for any new or updated application. Customers will be more comfortable
and confident in deploying new technology when they can monitor how it is performing in
production and know how to get out of trouble quickly when something goes wrong.
Most problems that impact the service delivery of an application could be fixed before the
problem is visible to end users. Effective health modeling ensures that the operations team
thoroughly understands what affects the health of their system, so problems can be detected
before service is impacted and troubleshooting and resolution can be automated as much as
possible. When a problem is detected, the management model facilitates a thorough diagnosis
and a proper solution. Health modeling also enables the operator to take preventive-care
measures before problems occur to maximize system up-time.
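A health model is often easiest to think of as a small state machine: a set of named health states plus the transitions the application is allowed to make between them. The following Python sketch uses an invented three-state scheme; real health models define states and transitions specific to the managed entity.

```python
# Sketch of a health model as an explicit state machine.
# State names and the set of legal transitions are illustrative assumptions.
TRANSITIONS = {
    ("healthy", "degraded"), ("degraded", "healthy"),
    ("degraded", "failed"), ("failed", "degraded"),
    ("healthy", "failed"), ("failed", "healthy"),
}

class HealthModel:
    def __init__(self):
        self.state = "healthy"
        self.history = []
    def transition(self, new_state):
        if (self.state, new_state) not in TRANSITIONS:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        # a real implementation would raise an event to the monitoring tool here
        self.history.append((self.state, new_state))
        self.state = new_state

hm = HealthModel()
hm.transition("degraded")   # detected before end users see a failure
hm.transition("healthy")    # recovery restores normal operation
print(hm.history)
```

Making the transitions explicit is what allows problems to be caught early: a move into the degraded state is itself an event the operations team can alert on, before the entity fails outright.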

Performance Modeling
Performance modeling is used to capture the expected performance of a system, defining a
baseline that can be measured against in the future. Performance modeling is closely related to
instrumentation modeling (performance counters are a form of instrumentation) and health
modeling (an application that is performing poorly compared to a pre-determined baseline is
typically considered to be unhealthy).
Performance modeling is useful in capacity planning because it can be used to help determine
expected performance when a system is put under stress or when the configuration of a system
is changed in some way.
A monitoring tool is normally used to measure an application against the performance
information in a management model. When the monitoring tool detects that the application is
not responding or is failing to meet the expected performance level, it can raise an alert to the
operations staff and send an e-mail message. Operators can check the performance and event
logs to get diagnostic information about the problem that will help them recover the application
in the shortest possible time.
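The baseline comparison that a monitoring tool performs can be sketched in a few lines. The numbers and the counter name in this Python illustration are invented; a real deployment would take the baseline from the performance model and the measured value from a live counter.

```python
# Sketch: compare a measured counter value against the modeled baseline
# and produce an alert when it deviates beyond a tolerance.
# Counter name, baseline, and tolerance are invented for illustration.
def check_against_baseline(counter, value, baseline, tolerance=0.2):
    """Return an alert string if value exceeds baseline by more than the
    fractional tolerance; otherwise return None."""
    if value > baseline * (1 + tolerance):
        return f"ALERT: {counter} at {value} exceeds baseline {baseline}"
    return None

baseline_ms = 150  # expected request execution time from the performance model
print(check_against_baseline("RequestExecutionTime", 140, baseline_ms))  # None
print(check_against_baseline("RequestExecutionTime", 300, baseline_ms))
```

In practice the monitoring tool, rather than the application, performs this comparison; the management model's contribution is the baseline and tolerance against which the live counters are judged.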

Modeling Instrumentation and Health


Ideally, a management model should encapsulate all knowledge of a system. However, in cases
where this is not possible, as a minimum the management model should include health and
instrumentation information. Health and instrumentation are intrinsically linked together,
because instrumentation is important to determining the health of any system. Therefore, this
section starts with how to define an effective instrumentation model, and then it moves on to
discuss health models.

Effective Instrumentation Modeling


Instrumentation is responsible for exposing the internal state of an application or system.
Instrumentation, along with additional health indicators, can be used to reveal the health of an
application by capturing a transition between a healthy and an unhealthy state.
However, not all instrumentation is directly related to health. For example, a request sent from
one service to another may be instrumented, but that event does not by itself indicate the
health state of the application. Instrumentation can reveal more detailed information that may be used for a
number of purposes, including the following:

• It can provide performance information for the application.


• It can demonstrate usage trends for an application (for example, to support capacity
planning).
• It can show whether service-level agreements (SLAs) have been met.
• It can provide a basis for usage charges.
• It can provide information that can be used to troubleshoot application problems.
• It can reveal security breaches.

Your management models should capture the abstract instrumentation requirements for your
application. The developer can then use these requirements to create the corresponding
instrumentation artifacts.

Types of Instrumentation
Typically, instrumentation takes one of two forms in an application:

• Performance counters
• Events

When determining how to support manageability, you should consider how operations will
consume the instrumentation you create. Instrumentation created by the developer may be
consumed in a relatively raw form by the operator—for example, by examining event logs or by
using a low-level tool or script to examine Windows Management Instrumentation (WMI)
events. However, particularly in larger organizations, the operator may have access to a tool
such as Microsoft Operations Manager (MOM), which allows him or her to see the information
in a more structured way, and can automate many of the processes of effective operations, such
as creating rule sets and issuing alerts.

Performance Counters
Performance counters provide continuous metrics for specific processes or situations within the
system. For example, a performance counter may indicate the current processor usage as a
percentage of its maximum capacity or the percentage of memory available. The metric can also
be an absolute value instead of a percentage, such as the number of current connections to a
database, or the number of queued requests for a Web server.
The operating system and the default services, such as Internet Information Services (IIS) and the
Common Language Runtime (CLR), expose built-in performance counters. In general, you should
aim to use these where possible, complementing them with custom performance counters only
where necessary. For example, your management model should specify use of the built-in IIS
Request Execution Time counter if this can provide the information required by the
management model. In this case, adding an equivalent custom counter will simply add to the
load on the server; it will not achieve anything extra.

Built-in counters cover a wide range of processes in IIS, ASP.NET, the CLR, and SQL Server. For
a complete list of these counters, see "Windows Server 2003 Performance Counters
Reference" on Microsoft TechNet at
http://technet2.microsoft.com/WindowsServer/en/library/3fb01419-b1ab-4f52-a9f8-
09d5ebeb9ef21033.mspx.

Events
Monitoring tools can read the event logs of each server in a distributed application and use this
information to raise alerts and send e-mail messages to specified groups of operators when
problems occur. They can also indicate recovery from a problem, which allows operators to
verify that resolution of a problem was successful. Events may take many forms, including
Windows Event Log events, WMI events, and trace statement file entries.
You should consider specifying events for all possible state transitions; operators can filter those
that are of interest. To allow filtering to take place in the monitoring environment, events must
specify a severity and a category in addition to the description and, where possible, recovery
information. Events can also specify security levels; in this case, filtering can take place based on
an operator's security status.
You can use events to indicate non-error conditions if this is appropriate for your application, or
if it is necessary to indicate state changes. To indicate a state change, you can arrange for a
service to raise an event when it starts and again when it completes processing of each Web
service request. An event handler can then be used to determine the average number of
requests in a particular period, the average request time, and the total number of requests. If
these values reach some pre-defined threshold, the event handler then raises another event.
In this case, you are using events to implement a counter, and then monitoring the counter
within your code. However, the overall result is that, in line with the principles of health
monitoring, your application raises an event to the monitoring system that indicates a state
transition. Your management model will simply indicate that the specified process can undergo
a state transition and the parameters that indicate when this state transition takes place.
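The pattern described above, in which start/complete events feed a handler that raises a derived event when a threshold is crossed, can be sketched as follows. The threshold value and event wording in this Python illustration are invented.

```python
# Sketch: a handler counts request-complete events and raises a derived
# event when the count crosses a pre-defined threshold. The threshold
# value is an invented example.
class ThresholdHandler:
    def __init__(self, threshold):
        self.threshold = threshold
        self.completed = 0
        self.raised = []
    def on_request_complete(self):
        self.completed += 1
        if self.completed == self.threshold:
            # in a real system this would be an event to the monitoring tool
            self.raised.append(f"request volume reached {self.completed}")

handler = ThresholdHandler(threshold=3)
for _ in range(3):
    handler.on_request_complete()
print(handler.raised)
```

The application is effectively using events to implement a counter, as the text notes; what the monitoring system sees is only the single derived event signaling the state transition.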

Determining What to Instrument


Determining what to instrument in an application is a critical factor in application design.
Specifying and applying the wrong types of instrumentation, at the wrong places and in
inappropriate numbers, may provide a wealth of information but can seriously affect application
performance. The alternative, specifying too few indicators or indicators of the wrong type, can
result in state changes occurring without operators being aware of them.
Instrumentation information that is relevant to operators includes the following:
• Instrumentation directly related to actions the operator can perform to fix a problem.
This type of instrumentation is normally related to the configuration settings within an
application, or reveals a dependency causing the problem.
• Supporting information that helps the operator to diagnose a problem. For example,
performance counter information can help an operator see that a service is being
underutilized.
• Instrumentation at multiple levels that allows application issues to be related to
problems in the underlying platform or hardware. In many cases, application problems can
be caused by lower-level issues. Instrumenting at multiple levels helps the operations
team to determine that this is happening.
• Information that allows the operator to determine the urgency of a task. An operator
may have 20 tasks to perform, and operations will be more efficient if priority is given
to urgent tasks.

In general, instrumentation should be comprehensive but relevant. One approach is to
instrument everything that could possibly change state in some way. However, this has a
number of disadvantages:
• It requires increased development time.
• Much of the instrumentation is irrelevant to operators.
• It can negatively impact the performance of an application.
• It requires additional effort from the operations team to determine which
instrumentation is relevant (although tooling such as System Center can help in filtering
events).

Another approach is to focus instrumentation on elements that are relevant to operations. If the
application is structured according to the principles outlined earlier in this chapter, the services
that make up the application will correspond to those defined in the management model; they
will also correspond to the units of operation that can be seen and, in many cases, configured by
operations. This approach offers a number of advantages:
• It provides information directly relevant to operations, which can be acted upon.
• It requires less development time (although potentially more initial time in determining
the services).

The difficulty of the second approach is that it requires the developer to map instrumentation to
the operator's view of the application. In some cases, it may not be possible to determine
exactly what instrumentation will be most useful at run time, although involving the
infrastructure architect and administrators in the creation of the management model should
help.
The recommended approach when determining what to instrument is to perform extensive
instrumentation of all elements that could be of use to operations, and provide run-time
configurability to allow operators fine-grained control over instrumentation.

Granularity of Instrumentation
After you decide what to instrument, you need to determine the appropriate level of granularity
for instrumentation. Normally, the appropriate level of instrumentation will depend on the
current health state of an application. When an application is functioning normally, the operator
will require minimal detail to indicate successful operations. However, if the application has a
problem, or is about to have a problem, more detailed and granular information is useful.
To support this, you should consider instrumenting the application at a fine-grained level but
allowing the operator to configure the level of instrumentation that is exposed at application
runtime.
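One way to provide this fine-grained control is to tag every event with a verbosity level and let operators raise or lower the exposed level while the application is running. The following Python sketch is illustrative only; the level names and the event mechanism are assumptions, not something the guide prescribes:

```python
from enum import IntEnum

class Verbosity(IntEnum):
    """Levels an operator can select at run time (names are illustrative)."""
    MINIMAL = 0   # healthy-state and lifecycle events only
    NORMAL = 1    # state changes and warnings
    DETAILED = 2  # fine-grained diagnostic events

class Instrumentation:
    """Developers instrument at the finest level; operators choose how much
    of that instrumentation is actually exposed."""

    def __init__(self, verbosity=Verbosity.NORMAL):
        self.verbosity = verbosity
        self.emitted = []

    def set_verbosity(self, verbosity):
        # Reconfigurable at run time; no redeployment of the application needed.
        self.verbosity = verbosity

    def raise_event(self, name, level):
        # Events above the configured verbosity are suppressed, not raised.
        if level <= self.verbosity:
            self.emitted.append((name, level))

inst = Instrumentation(Verbosity.MINIMAL)
inst.raise_event("ServiceStarted", Verbosity.MINIMAL)
inst.raise_event("CacheMissRatioHigh", Verbosity.DETAILED)  # filtered out
inst.set_verbosity(Verbosity.DETAILED)                      # operator drills down
inst.raise_event("CacheMissRatioHigh", Verbosity.DETAILED)  # now exposed
print([name for name, _ in inst.emitted])
```

In a healthy system the operator leaves the level at MINIMAL or NORMAL; when a problem is suspected, raising the level exposes the detailed events without redeploying the application.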

Performance Considerations
When determining how to instrument your application, you should bear in mind that monitoring
performance counters and raising events consumes resources on the system being monitored.
As a general rule, you should ensure that monitoring does not consume more than 10 percent of
the available resources on the host.

Building Effective Health Models


An application is considered healthy if it is operating within a series of defined parameters. A
number of factors may result in a change in application health, including the following:
• Change in application configuration
• An application update
• A change in an external dependency
• A hardware change
• A network change
• Bad input to the application
• Scalability problems
• Operator error
• Change in deployment
• Malicious attack
You should always design your applications and services in a way that maximizes performance
and minimizes resource usage. Part of the definition of a healthy application is that it maximizes
performance by making appropriate use of the available infrastructure. It should do the
following:
• Release memory and resources as soon as possible.
• Minimize its footprint on the operating system and available hardware.
• Make best use of the environment and other systems and features.
• Not adversely affect other applications and services.

Health States
The overall health of an application or system is determined by the health of the managed
entities that make up the application. A managed entity is typically considered to be in any one
of three health states:
• RED. This corresponds to a failed state.
• YELLOW. This corresponds to a less than fully operational state.
• GREEN. This corresponds to normal operation within expected performance
boundaries.

In some cases, it is considered beneficial to differentiate between a failed state and an offline
state. In this case, the failed state is represented by RED and the offline state is represented by
BLACK.
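These states map naturally onto a small enumeration in monitoring code. The Python sketch below is illustrative; in particular, ordering BLACK as "worse" than RED is an assumption made here so that roll-up logic can pick a worst state, not a rule from the guide:

```python
from enum import Enum

class HealthState(Enum):
    GREEN = "normal operation within expected performance boundaries"
    YELLOW = "less than fully operational"
    RED = "failed"
    BLACK = "offline"  # optional: only when failed and offline are distinguished

def severity(state):
    """Orders states so that roll-up logic can select the worst child state.
    Placing BLACK after RED is an illustrative assumption."""
    order = [HealthState.GREEN, HealthState.YELLOW, HealthState.RED, HealthState.BLACK]
    return order.index(state)

# The worst of a set of child states drives the parent's indicator.
worst = max([HealthState.GREEN, HealthState.YELLOW], key=severity)
print(worst.name)  # YELLOW
```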

Information about the health state of managed entities can be manually gathered by operators
or by management tools that allow operators to do the following:

• Detect a problem.
• Verify that the problem still exists.
• Diagnose the cause(s) of the problem.
• Resolve the problem.
• Verify that the problem was resolved.

Designing a health model entails the following:

• Build the correct application structure, which is made up of components derived from
appropriately predefined components (base classes as defined in the Common Model
Library [CML]) and the relationships between them.
• Build a hierarchy of managed entities that represent the logical services and objects the
application exposes—in a way IT professionals can understand.
• Identify the functional aspects for each managed entity that are of interest for
monitoring. For more information about aspects, see the definition of a managed entity
in Chapter 3 of this guide.
• Identify all the health states that are possible for the application.
• Identify the verification steps that need to be taken to confirm or refute whether an
aspect is in a particular health state.
• Provide the instrumentation required to detect each health state.
• Identify the diagnostic steps needed to determine the root causes for each aspect's
health state.
• Identify the recovery steps that need to be taken to resolve each root cause and return
an aspect and its parent managed entity to full health.

Health State Hierarchies


In a health model, managed entities are arranged in a hierarchical form. This allows the health
state of a child managed entity to affect the health state of a parent managed entity. Aspects
can also be collected together into aggregate aspects. Aspects may also form parent-child
relationships with managed entities.

Managed Entity Hierarchies


The managed entity hierarchy is the starting point for any health model; its structure drives the
definition, connection, and relationships for all the other concepts in the health model.
For example, consider an application that consists of a number of Web services and databases.
Figure 1 illustrates the dependencies between the different entities that make up the
application.

Figure 1
Dependencies in the example application
The following table indicates the health states of the low-level entities illustrated in Figure 1.
Entity               State    Description and effect
CustomerDatabase     GREEN    Working normally
                     YELLOW   Degraded, will impact CustomerWebService
                     RED      Failed, CustomerWebService will fail
ProductsDatabase     GREEN    Working normally
                     YELLOW   Degraded, will impact ProductsWebService
                     RED      Failed, ProductsWebService will fail
TransportWebService  GREEN    Working normally
                     YELLOW   Degraded, will not directly impact ProductsWebService,
                              but operators should receive a warning
                     RED      Failed, ProductsWebService will not fail, but operators
                              should receive a warning

The customer Web service and products Web service both have dependencies on these low-
level entities and a corresponding dependency on the health state, as shown in the following
table.

Entity              Dependencies                         State   Description and effect
CustomerWebService  CustomerDatabase GREEN               GREEN   Working normally
                    CustomerDatabase YELLOW or GREEN     YELLOW  Degraded, will impact OrderApplication
                                                                 and ExtranetWebSite
                    CustomerDatabase RED, YELLOW or      RED     Failed, OrderApplication and
                    GREEN                                        ExtranetWebSite will fail
ProductsWebService  ProductsDatabase and                 GREEN   Working normally
                    TransportWebService both GREEN
                    ProductsDatabase YELLOW or GREEN;    YELLOW  Degraded, will impact OrderApplication
                    TransportWebService RED, YELLOW              and ExtranetWebSite
                    or GREEN
                    ProductsDatabase RED, YELLOW or      RED     Failed, OrderApplication and
                    GREEN; TransportWebService RED,              ExtranetWebSite will fail
                    YELLOW or GREEN
The health state of these entities eventually has an effect on the health state of the business
processes that they enable, as shown in the following table.

Entity            Dependencies                          State   Description and effect
ExtranetWebSite   CustomerWebService GREEN and          GREEN   Working normally
                  ProductsWebService GREEN
                  CustomerWebService YELLOW or GREEN;   YELLOW  Degraded
                  ProductsWebService YELLOW or GREEN
                  CustomerWebService RED, YELLOW or     RED     Failed
                  GREEN; ProductsWebService RED,
                  YELLOW or GREEN
OrderApplication  CustomerWebService GREEN and          GREEN   Working normally
                  ProductsWebService GREEN
                  CustomerWebService YELLOW or GREEN;   YELLOW  Degraded
                  ProductsWebService YELLOW or GREEN
                  CustomerWebService RED, YELLOW or     RED     Failed
                  GREEN; ProductsWebService RED,
                  YELLOW or GREEN

Aggregate Aspects
Different audiences or consumers of an application or service require different views of the
health of a managed entity. An aggregate aspect provides a higher-level view of a health state
by aggregating health state information from different aspects (and potentially other aggregate
aspects). A common scenario for using aggregate aspects is when you need to represent the
health state for a particular functional area of an application at the managed entity level and a
managed entity has multiple instances (multi-instance managed entity). Figure 2 illustrates a
case where health state can be aggregated at different levels.
Figure 2
Health state of an aggregate aspect
In this case, Web service A is a parent of multiple instances of Web service B (residing on a Web
farm). Web service B has an aspect named connectivity, which corresponds to connectivity to a
database. An administrator wanting to monitor Web service A looks at the connectivity aspect,
which turns yellow if 50 percent of the instances of Web service B have no connectivity and red
if 75 percent of the instances of Web service B have no connectivity.
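The Figure 2 scenario can be sketched directly in code. The 50 percent and 75 percent thresholds come from the text above; the representation of instance connectivity is an illustrative assumption:

```python
from enum import Enum

class State(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"

def aggregate_connectivity(instance_connectivity):
    """Rolls the connectivity aspect of the Web service B instances on the
    farm up into a single aggregate aspect on Web service A.

    YELLOW if 50 percent or more of the instances have no connectivity,
    RED if 75 percent or more have no connectivity (thresholds per Figure 2)."""
    failed = sum(1 for connected in instance_connectivity if not connected)
    ratio = failed / len(instance_connectivity)
    if ratio >= 0.75:
        return State.RED
    if ratio >= 0.50:
        return State.YELLOW
    return State.GREEN

# Four instances on the Web farm; three have lost database connectivity.
print(aggregate_connectivity([True, False, False, False]).name)  # RED
```

An administrator monitoring Web service A then sees one indicator rather than one per farm instance, while the per-instance aspects remain available for drill-down.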

Rolling Up Aspects into Managed Entities


Ultimately, the two questions you should ask when determining how to roll up aspects into
managed entities are the following:
• What impact does the health state of an aspect have on its managed entity?
• What impact does a managed entity have on its parent?

In some cases, a RED (failure) for one aspect may cause its managed entity, and perhaps even
the entire application, to fail. In other cases, a RED (failure) for one aspect may cause only
degraded performance of one managed entity (YELLOW). There are no definite rules to help you
decide, because each application is unique and the effects of each component and managed
entity will vary. However, there are two general rules:

• If any child of a parent is RED, the parent should be RED or YELLOW. Otherwise,
operators viewing only roll-up indicators will not realize that there is a failure
somewhere in the application.
• If a managed entity is vital for operation of the application, a RED state must cause all
parents and ancestors to be RED, indicating at the top level of the monitoring tree that
the application has failed.
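These two rules can be expressed as a small roll-up function. The sketch below is Python purely for illustration; the entity names are hypothetical, and treating a non-vital failure as YELLOW (rather than RED) is a policy choice the guide leaves open. Applying the function recursively up the tree, with the parent itself marked vital at each level, produces the "all ancestors RED" behavior:

```python
from enum import IntEnum

class State(IntEnum):
    GREEN = 0
    YELLOW = 1
    RED = 2

def roll_up(children, vital=frozenset()):
    """Computes a parent's state from its children's states.

    Rule 1: any non-GREEN child prevents the parent from showing GREEN,
    so operators viewing only roll-up indicators still see the problem.
    Rule 2: a RED child named in `vital` fails the parent outright."""
    state = State.GREEN
    for name, child_state in children.items():
        if child_state is State.RED and name in vital:
            return State.RED
        if child_state is not State.GREEN:
            state = State.YELLOW
    return state

# A failed cache degrades the service; a failed (vital) database fails it.
print(roll_up({"cache": State.RED, "db": State.GREEN}, vital={"db"}).name)   # YELLOW
print(roll_up({"cache": State.GREEN, "db": State.RED}, vital={"db"}).name)   # RED
```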
Exactly how roll-ups are used in determining health will depend on the technology used. For
example, System Center Operations Manager (SCOM) 2007 uses roll-ups in a different way from
the MMD tool.

Monitoring and Troubleshooting Workflow


When defining and consuming health models, it is important to determine, both in automatable
and human readable form, how to detect, verify, diagnose, and recover from errors. This is the
monitoring and troubleshooting workflow, which defines the logical stages of the monitoring
process and problem recovery process. The stages of this process are illustrated in Figure 3.

Figure 3
Troubleshooting workflow

Detection
A monitoring agent defines how the health states of a particular aspect can be detected.
Typically, there are multiple ways to detect a problem with a managed entity. To detect a
problem, a monitoring agent can do the following:
• Listen for events related to the health of the managed entity.
• Poll and compare performance counters against the specified thresholds as the basis to
detect a problem.
• Scan trace logs for information used to detect a problem.
• Use health indicators, such as heartbeats or synthetic transactions, to determine health.
For more details about the specific health indicators you can use, see Chapter 6,
"Specifying Infrastructure Requirements."

The instrumentation listed within a single detector is linked by the OR logic operator—that is,
the application enters the health state if any of the items is detected. If multiple detectors are
present, they are linked by the AND operator, and all the conditions need to be detected
simultaneously for the application to enter the health state.
A detector can also be given a NOT flag, in which case the health state is signaled by the
absence of the listed instrumentation within a particular time frame.
When the defined problem signatures are detected, a problem associated with the operational
condition and health state is indicated. Until the condition is verified, the associated health state
is not updated, and diagnosis and recovery steps should not be attempted.
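The OR/AND/NOT combination described above can be sketched as follows. The event names and the detector representation are illustrative assumptions, not part of any monitoring product's API:

```python
def detector_fires(observed, items):
    """A single detector lists instrumentation items joined by OR:
    it fires if any one of its listed items was observed."""
    return any(item in observed for item in items)

def state_detected(observed, detectors):
    """Multiple detectors are joined by AND: every detector must be
    satisfied for the application to enter the health state. A detector
    carrying the NOT flag is satisfied only when none of its items were
    observed within the time frame. Each detector is (items, negated)."""
    return all(
        detector_fires(observed, items) != negated
        for items, negated in detectors
    )

# Instrumentation observed during the current time frame (hypothetical names).
observed = {"Event1021", "QueueLengthHigh"}
detectors = [
    ({"Event1021", "Event1022"}, False),  # OR: either event indicates trouble
    ({"HeartbeatReceived"}, True),        # NOT: heartbeat absent in time frame
]
print(state_detected(observed, detectors))  # True
```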

Verification
After a problem is detected, it is often necessary to verify that it actually still exists. This step is
critical to make sure the problem was not simply a surge in resource consumption, a spike in
workload, or a transient issue that has since gone away. Verification is basic confirmation that
the application is in a particular operational condition without trying to diagnose why or to
recover from it.
The logic that verifies whether an aspect is in a RED or YELLOW operational condition should be
placed in a separate external verifier that will simply return which of the three possible
conditions is in effect at the time. Verifiers should not attempt any kind of diagnosis because
they need to be lightweight and their job is only to confirm whether or not the loss of
functionality (such as "Queue Latency Critical" or "Can't Print") is still observed. Having a verifier
that is built as an external script or executable file will allow the same piece of code to be used
to do the following:

• Confirm whether or not a particular aspect is in an unhealthy health state.


• Verify that recovery actions were successful at resolving a particular problem with an
aspect.
• Perform on-demand detection of the condition of an aspect even in the absence of an
event being logged. Scheduled execution of verifiers can be used as health "pings"
specific to the service even if no user has yet noticed a problem.
• Provide troubleshooting tools that can be used in multiple environments and
applications.

In proactive monitoring environments, such as Microsoft Operations Manager, the verification
step is frequently combined with other parts of the monitoring workflow. This can be done
because there is not usually a delay between detection and the start of diagnosis during which
the problem may have gone away on its own.
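As a sketch of this idea, the verifier below is written as a small standalone Python script. The aspect checked ("Queue Latency Critical"), the thresholds, and the exit-code convention are all illustrative assumptions; the point is that the same lightweight script can serve detection, re-verification, and scheduled health pings:

```python
import sys

# Exit codes for the three possible conditions (an illustrative convention).
GREEN, YELLOW, RED = 0, 1, 2

def verify_queue_latency(latency_ms, warn_ms=500, fail_ms=2000):
    """Confirms the current condition of a hypothetical 'Queue Latency
    Critical' aspect. Deliberately lightweight: it reports which condition
    is in effect, with no diagnosis of why latency is high."""
    if latency_ms >= fail_ms:
        return RED
    if latency_ms >= warn_ms:
        return YELLOW
    return GREEN

print(verify_queue_latency(1800))  # 1 (YELLOW)

if __name__ == "__main__" and len(sys.argv) > 1:
    # The same script can be run by the monitoring system on detection,
    # after recovery actions for re-verification, or on a schedule as a
    # health "ping" specific to the service.
    sys.exit(verify_queue_latency(float(sys.argv[1])))
```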

Diagnostics
After a negative health state has been detected within an aspect and confirmed to still exist, it
may be necessary to perform diagnosis to determine the root cause of the problem so the
appropriate recovery actions can be taken. Wherever possible, you should try to have
instrumentation that is specific enough to lead directly to resolution, thereby avoiding this
step. Even if you do not know the exact root cause of a problem, there is usually a good
indication of where to start diagnosis based on the context of how the problem was detected.
In many cases, further analysis is required during diagnosis. For example, it may be known that
there is a network connectivity problem of some kind because of an error code that was
returned to the application. However, until it has been determined that the IP address lease
from the DHCP server was lost, the steps needed to fix it (attempting to renew the lease) are not
clear. Additional trace logs may have to be examined, correlation of information from other
events may have to be done, or even querying the live run-time state may be necessary to
determine the true root cause of a problem.
The diagnostics step uses all forms of available instrumentation, such as events, performance
counters, WMI providers, and activity traces, to correlate information and determine the root
cause. The diagnostics step can take a long time and may further disrupt service while it is
running. It may be necessary to inspect a much broader set of internal state parameters and
to correlate information between applications to complete the diagnostics step.
The diagnosis step captures the step-by-step instructions of what someone needs to do to
diagnose the root cause of a problem, and may also include script or code to automate this
diagnosis. It can be thought of as a function that takes a general high-level indication of what is
causing a particular aspect of health state as input and returns a specific root cause that can
then be used to take the appropriate recovery steps. The event or performance counter that
leads to the detection of the health state will usually indicate where to start diagnosis for the
problem.

Resolution
After the root cause is identified, the next step is to attempt to resolve the problem. This
process can involve reconfiguration of the application, restarting a service, manipulating internal
state by calling some management API, or performing some other administrative task.
Resolution may also be in the form of a code or a script that will attempt to automate the
resolution steps. It can also reference the GUID of another "blame" managed entity that is
failing to provide the needed services; that entity then becomes the new starting point for diagnosis.

Re-verification
The same verification procedure that was used to verify the existence of the operational
condition is used to re-verify that the operational condition has indeed been corrected. When
the issue has been successfully resolved, the verifier will no longer report the unhealthy condition.

Structure of a Health Model


You can create a health model following your own custom structure, and use any tool or
application of your choice (for example, a spreadsheet such as Microsoft Excel); however,
specialist tools can make the process much easier. They help you to generate a model that conforms to the
accepted formats and schemas. Some tools can also generate Management Packs aimed at the
common monitoring applications, such as Microsoft Operations Manager (MOM). For example,
AVIcode produces a range of tools and integration kits specially designed to meet the Design for
Operations (DFO) vision and work with .NET Framework applications and Microsoft tools. For
more details, see http://www.avicode.com/.
The Management Model Designer (MMD) makes it easy to generate a health model and export
it as a MOM Management Pack. You can also use the Health Model Editor within MOM to create
a model of the instrumentation exposed by an existing assembly, edit this to create a health
model, and then generate the corresponding Management Pack. For more information about
Management Packs in both MOM 2005 and SCOM 2007, see Chapters 15–17 of this guide.
A comprehensive health model can contain state transitions that fall into three categories, each
aimed at a specific area of the application requirements:
• Business Operations health model. This defines the requirements in terms of business
rules and SLAs, and it includes contingency information.
• Application health model. This defines the requirements for the processing and
performance of the application as a whole, its related services and components, and
associated services, such as databases.
• System health model. This defines the requirements for the underlying operating
system processes used by the application.

Figure 4 illustrates the structure and content sections of typical health models for the System,
Application, and Business Operations categories.

Figure 4
The structure and content sections of a typical health model
As shown in Figure 4, each section contains the following:

• Requirement. This includes a series of rule definitions in appropriate terms for the
section in which it resides—for example, a rule in the Business Operations section that
all orders marked as "urgent" should be processed and completed within four hours.
• Detection Information. This includes a series of rules or functions that implement the
detection information. The rules or functions indicate the health state or condition of
the application (such as "offline" or "failed"), the criticality (RED, YELLOW, or GREEN),
the alerts to send to the operator and monitoring system, and a series of indicators that
define the Health and Diagnostics Workflow sections to which this rule applies. There
may also be a contingency plan that describes workaround procedures while awaiting
rectification of the fault.
• Health and Diagnostics Workflow. This describes the steps to verify the fault, diagnose
the causes, resolve the problem, and re-verify the solution afterward.

Mapping Requirements to Individual Indicators


In some cases, the rules in the health model do not map directly to the set of indicators required
by the application. The following are some examples:

• The health model may define requirements for which measurement is possible only
through using indicators built into the operating system or some underlying application
code, such as one of the Enterprise Library application blocks. In this case, you would
take advantage of the indicators provided by the operating system or underlying
application instead of creating a specific indicator. However, the rule is still part of the
health model for the application you are designing.
• A state transition in specific components may reflect one of a set of different underlying
problems. Correct implementation of the instrumentation will include indicators for
each detectable condition, such as failure to open a connection, failure to update data,
or failure to commit a transaction. Each indicator will return a RED, YELLOW, or GREEN
status, allowing operators to see if the failure is because of, for example, an incorrect
connection string, incorrect permissions within the database, or failure of another of
the series of data access operations.
• Some failures may have more than one cause but only one effect that is detectable
within the application. For example, failure to access a database may be the result of a
network failure, an incorrect connection string, or a database server failure. However,
indicators for this aspect within the application will probably not be able to detect the
actual cause and will just return RED (failed). To diagnose this failure requires other
indicators within the database system, which are not part of the health model for this
application.

A state transition defined in the health model will not always map directly to a single indicator
status change; in some cases, it will reflect the overall results from a combination of settings.

Multiple Distributed Managed Entities


Another common scenario is a controller that manages multiple instances of managed entities.
For example, a Web service may reside on an array of four servers fronted by a router that
distributes requests amongst them based on their current response times. You may be tempted
to think of the Web service as a single managed entity, and implement instrumentation to
measure overall response times, but this may not provide the optimum solution. When the
response times fall below the minimum, and a YELLOW or RED state occurs, all you can
determine is that the Web service farm is at fault. You cannot determine which, if any, server is
at fault, or if the router or a network connection has failed.
Instead, you can think of each Web service as a separate managed entity that you create
instrumentation for, as shown in Figure 5. Rolling up or combining these managed entities
allows administrators to obtain an overall view that is, in some respects, an average of
performance or an indicator of the overall health of the complete application or a subset of
processes. When a YELLOW or RED indication appears, operators can drill down to the next level
and get extra information that allows them to isolate the source of the problem more quickly.

Figure 5
Rolling up managed entities from a Web server farm
The example in Figure 5 illustrates how you can circumvent some of the issues you may
encounter when mapping a health model to the managed entities it describes. To create an
indicator for the overall state of this section of the application in terms of the number of servers
online (if this was a requirement of the health model), you would have to use a probe, such as a
ping request to each server through its IP address, to detect individual server failures.
The alternative of rolling up the individual aspects from the servers through a rule in the health
model makes more sense and reduces the impact on the servers. In the monitoring application,
you would implement the health model rule using the tools that monitor the event logs of the
individual servers and a roll-up rule or combining rule that produces the overall indication based
on the definition in your health model of the minimum number of servers to be online.

Northern Electronics Scenario


The solutions architect has considered creating a fully comprehensive management model for
the Northern Electronics shipping application, but he currently lacks the resources to do so.
However, he is committed to modeling health and instrumentation for this application.
Instrumentation Model
The application will provide comprehensive instrumentation in the form of events and
performance counters. The solutions architect has defined the following abstract events for the
application:
• PickupServiceNETGeneralError
• PickupServiceSOAPError
• ShippingServiceConfigurationError
• ShippingServicePickupFault
• ShippingServiceSOAPError
• ShippingServiceSQLException
• ShippingServiceSuggestedDateMismatch
• ShippingServiceTruckArrivalDelayed
• TransportServiceConfigurationError
• TransportServiceOrderFault
• TransportServiceSOAPError
• TransportServiceSQLException
• TransportServiceUnknownError
• OrderClientConfigurationError
• OrderClientSQLError
• OrderClientUnknownEvent

The solutions architect has also defined the following abstract measures for the application:
• PickupServiceConfirmPickup
• ConfirmShippingService
• ConfirmShippingServiceResponse
• DelayedShippingService
• DelayedShippingServiceResponse
• ShippingRequestPerSecond
• TransportServiceOrderTransport
• TransportServiceOrderTransportResponse

Health Model
The solutions architect has defined the following operational requirements for the application:
• The Transport Order Web service must be available at all times.
• The Transport Order application must be available at all times.
• The Warehouse Management application must have an availability exceeding 90
percent.
• The Shipping Service must be available at all times. However, if the Transport Order
Web service is not available, it will store transport requests until it can pass them to the
Transport Order Web service and must store them for a maximum of two hours.

The solutions architect will use the health model for the application to help determine whether
these requirements are being met. The solutions architect does not consider it necessary to
differentiate between offline and failed health states, so he defines the following three health
states for each managed entity:
• Green. This indicates the managed entity is working normally.
• Yellow. This indicates the functionality of the managed entity is degraded.
• Red. This indicates the managed entity is either unavailable or offline.

The solutions architect has defined the following aspects for each managed entity:

• Connectivity
• Data access

Other documentation that references Northern Electronics differentiates between a failed and
offline state, so it uses the four health states: Green, Yellow, Red, and Black.

The Transport Order Web service depends on the Transport Order application and the
Warehouse Management application. This means that the health of the Transport Order Web
service is affected by the health of these other managed entities and their corresponding
aspects. Figure 6 illustrates how problems with the Transport Order Web service and
Warehouse Management applications affect the health of the Transport Order application and
the Shipping Service.

Figure 6
Rolling up health states
In this case, the connectivity aspect of the Transport Order Web service is RED, indicating a
failure. This means that the Transport Order application is also RED because it cannot process
requests, even though its own Data Access aspect is GREEN. The Warehouse Management
application is GREEN because its Data Access aspect has only just transitioned to YELLOW, and
this situation is within the operating parameter defined in the health model (90 percent
availability).
The health model also defines the contingency situation for the Shipping Service in that it will
store requests for the Transport Order application for a maximum of two hours. Assuming that
this period has not yet passed, the Shipping Service is YELLOW, indicating a pending problem but
not a failure.
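The contingency rule for the Shipping Service can be captured as a simple state function. This Python sketch uses only the facts stated in the scenario (the two-hour storage window, YELLOW as a pending problem, RED as failure); the function shape and parameter names are illustrative assumptions:

```python
def shipping_service_state(transport_service_available, oldest_stored_request_hours=0.0):
    """GREEN while the Transport Order Web service is reachable; YELLOW while
    transport requests are being stored within the two-hour contingency
    window; RED once the window has been exceeded."""
    if transport_service_available:
        return "GREEN"
    if oldest_stored_request_hours <= 2.0:
        return "YELLOW"
    return "RED"

print(shipping_service_state(True))        # GREEN
print(shipping_service_state(False, 1.5))  # YELLOW: pending problem
print(shipping_service_state(False, 3.0))  # RED: contingency exhausted
```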

Summary
This chapter has described how to create effective management models that capture the
knowledge about an application. In cases where all the knowledge cannot be captured, it is still
effective to use a management model to capture health and instrumentation information, which
are critical to designing manageable applications.
Chapter 5
Proven Practices for Application
Instrumentation
Software instrumentation provides information about executing applications. This information
can be used for a number of purposes, such as troubleshooting, capacity planning, business
monitoring, optimizing development, and security auditing. In this guide, instrumentation is
created to support the management of software applications by operations staff, where their
primary concern is the health of the application—for example, the response time for specific
operations, the availability of key resources, or the status of integration points. This chapter
provides a number of proven practices for the architecture and design of software
instrumentation in general, but it focuses on those aspects of instrumentation that help
operations staff determine application health.

Other proven practices regarding the general principles of architecting manageable
applications are discussed in Chapter 3, "Architecting Manageable Applications."

Events and Metrics


All data collected from instrumentation falls into one of two categories:
• Events. These are raised when specific things happen in a running software application,
or when things fail to happen as expected. Events provide contextual information about
the occurrence, including data such as machine name, process name, user context, and
date/time information. For example, if an application cannot read from a database, an
event could be raised that details the application process, the data connection
parameters, and even the exact syntax of the database query.
• Metrics. These represent measurement of a variable and the units associated with it.
For example, an application might choose to expose the amount of physical storage
being used, the available percentage of a critical resource, such as network capacity, or
the number of orders in a queue. Metrics may be used by themselves, or they may form
the basis of more complex measurements.
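The two categories can be sketched as minimal data structures. This Python sketch is illustrative; the field choices follow the contextual data listed above (machine name, process name, user context, date/time), and the example values are hypothetical:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Event:
    """Raised when something specific happens (or fails to happen);
    carries contextual information about the occurrence."""
    name: str
    machine: str
    process: str
    user: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    detail: str = ""

@dataclass
class Metric:
    """A measured variable plus the units associated with it; metrics may
    stand alone or form the basis of more complex measurements."""
    name: str
    value: float
    units: str

evt = Event("DatabaseReadFailed", "WEB01", "orders.exe", "svc-orders",
            detail="SELECT * FROM Orders; connection refused")
queue_depth = Metric("OrdersInQueue", 42, "orders")
print(evt.name, queue_depth.value)
```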

Architectural Principles for Effective Instrumentation


If you are responsible for architecting a well-instrumented application, you should adhere to the
following proven practices:
• Create a flexible instrumentation architecture.
• Create instrumentation that operations staff easily understands.
• Support existing operations processes and tools.
• Create applications that are not self-monitoring.
• Support flexible configuration of instrumentation.

The next sections describe each of these principles in more detail.

Create a Flexible Instrumentation Architecture


This guide is primarily concerned with creating instrumentation that supports application health.
However, developers can use the same instrumentation architecture to support capacity
planning, business monitoring, code optimization, and support debugging. You should design
your instrumentation architecture flexibly, so the same principles can be used across all these
areas.
The Team System Management Model Designer Power Tool (TSMMD) can be used to
design instrumentation for any purpose. If your management model extends to capture these
different areas, you may want to consider using the TSMMD tool for these purposes.

Create Instrumentation That Operations Staff Easily Understands


As discussed in Chapter 1, "Understanding Manageable Applications," in many cases
applications are designed with little regard to the operations perspective. This can lead to
incomplete instrumentation, but it can also lead to instrumentation that is relevant only to the
development team.
Where possible, the application should generate events and metrics around business entities,
such as customers, orders, and product queries, instead of the technical entities, such as
threads, stacks, and collections of internal objects. When designing the application, start with
basic instrumentation that supports the operational view of the system, and then add more
detailed metrics that provide further insight into your systems and applications. This will
promote a strong implementation that focuses on well-defined and relevant data.

Support Existing Operations Processes and Tools


If you require operations staff to adopt new procedures and use new tools, you may limit their
adoption. For example, if your operations staff uses a lot of scripting, you might consider
generating sample scripts along with your application that expose the naming and syntax of
your events. Instead of developing a custom tool for operations, it is often preferable to create a
Microsoft Management Console (MMC) snap-in.
Creating MMC snap-ins has been dramatically simplified with the release of MMC 3.0, which
ships with Windows Vista and Windows Server 2008, but it can also be installed on Windows XP
and Windows Server 2003. Another new platform capability to consider is Windows Eventing
6.0. When used on Windows Vista or Windows Server 2008, Windows Eventing 6.0 allows the
operations team to filter events, correlate events across computers, and create and save custom
views.

Create Applications That Are Not Self-Monitoring


It is desirable for applications to detect and recover from unhealthy conditions. However, those
applications cannot always be trusted to reliably monitor themselves. If the health of an
application deteriorates, it may be unable to accurately monitor its performance or even
perform any monitoring at all.
Even in cases where an application is providing reliable information about its own health, users
of the application are likely to perceive self-reported status from an application as biased.
Instead, wherever possible, you should generate instrumentation from the application and use
other external tools to correlate the events and determine the health of the application—for
example, by recording the number of failures in a defined period.
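As a rough sketch of this external-correlation approach (illustrative only; in practice a monitoring tool performs this role rather than hand-written code), an external monitor might count failure events in a sliding time window:

```python
from collections import deque

class FailureRateMonitor:
    """External monitor that flags an application as unhealthy when it
    raises too many failure events within a defined period."""
    def __init__(self, max_failures, period_seconds):
        self.max_failures = max_failures
        self.period = period_seconds
        self._failures = deque()  # timestamps of recent failure events

    def record_failure(self, timestamp):
        self._failures.append(timestamp)
        # Discard failures that fall outside the sliding window.
        while self._failures and timestamp - self._failures[0] > self.period:
            self._failures.popleft()

    def is_healthy(self):
        return len(self._failures) <= self.max_failures

monitor = FailureRateMonitor(max_failures=3, period_seconds=60)
for t in (0, 10, 20, 30):          # four failures within 60 seconds
    monitor.record_failure(t)
print(monitor.is_healthy())        # four exceeds three, so the entity is unhealthy
```

The application only emits the events; the judgment about health is made outside it, which avoids the self-reporting problem described above.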

Support Flexible Configuration of Instrumentation


Not all instrumentation needs to be used at all times. For example, very low-level
instrumentation may be required by a developer resolving a bug in an application but may have
no use to an operator. An operator may also require much more information from an
application that is unhealthy than from one that is functioning normally. Different
instrumentation is also required in different deployment environments, and it may be necessary
to choose between these deployment environments at deployment time or at run time. For this
reason, a manageable application should be well instrumented and it should allow that
instrumentation to be configured at design time, at deployment time, and at run time.

Using Instrumentation Levels to Specify Instrumentation Granularity


To support flexible granularity of instrumentation, you should define instrumentation levels for
each abstract event at application design time. The possible instrumentation levels may include
the following:

• Coarse. This level indicates the event is raised during all operations.
• Fine. This level indicates the event is raised during diagnostic and debug operations.
• Debug. This level indicates the event is raised only during debug operations.
• Off. This level indicates the event is not raised at all.

Do not use the granularity level for anything other than verbosity control. Configuration levels
should be inclusive; if you require different behavior, you should use different events.

After levels are defined for each event, an overall instrumentation level can be specified for
each managed entity in that entity's configuration file. Whether a particular
event is raised depends on comparing these two values. For example, an architect could
specify that if an event is specified as fine, and the overall instrumentation level for the
managed entity is specified in configuration as coarse, the event will not be raised. However, in
this case, if the overall instrumentation level in the configuration file is changed to fine or
debug, the event will be raised.
The following table shows in more detail how these rules are applied.

Instrumentation level (event)   Overall instrumentation level   Event raised?
Coarse                          Coarse                          Yes
Coarse                          Fine                            Yes
Coarse                          Debug                           Yes
Coarse                          Off                             No
Fine                            Coarse                          No
Fine                            Fine                            Yes
Fine                            Debug                           Yes
Fine                            Off                             No
Debug                           Coarse                          No
Debug                           Fine                            No
Debug                           Debug                           Yes
Debug                           Off                             No
Off                             Coarse                          No
Off                             Fine                            No
Off                             Debug                           No
Off                             Off                             No
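The rules in the preceding table reduce to a simple ordering of levels. The following Python sketch (the function and dictionary names are illustrative) captures the same logic:

```python
# Levels ordered from least to most verbose; Off disables the event entirely.
LEVELS = {"Off": 0, "Coarse": 1, "Fine": 2, "Debug": 3}

def event_raised(event_level, overall_level):
    """An event is raised only when neither level is Off and the overall
    instrumentation level is at least as verbose as the event's own level."""
    if event_level == "Off" or overall_level == "Off":
        return False
    return LEVELS[overall_level] >= LEVELS[event_level]

print(event_raised("Fine", "Coarse"))  # False: Coarse monitoring skips Fine events
print(event_raised("Fine", "Debug"))   # True: Debug includes Fine events
```

Because the levels are inclusive, raising the overall level never suppresses events that were already being raised, which is why the granularity level should control verbosity and nothing else.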

By specifying an overall instrumentation level in the configuration file for each managed entity,
it is possible to change the instrumentation level at run time in response to a change in
circumstances. For example, if a stop event is raised during coarse grain monitoring, the level of
instrumentation may be changed to fine, so that more information about the application can be
gained.
You may require additional configurability of instrumentation at run time. At a minimum, you
should be able to turn instrumentation on or off at run time. However, in many cases, in
addition to granularity control, you will also need to define other settings, such as designating a
remote source for logging.
The default overall instrumentation level will normally be set to Coarse in production
environments to maximize application performance.
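Such run-time settings are typically expressed in the managed entity's configuration file. A fragment might look like the following; the element and attribute names here are purely illustrative and do not correspond to any shipped schema:

```xml
<instrumentation overallLevel="Coarse">
  <!-- overallLevel may be raised to "Fine" at run time, for example
       after a stop event is detected during coarse-grain monitoring. -->
  <remoteLogging enabled="true" host="logserver.example.com" port="514" />
</instrumentation>
```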

Using Infrastructure Trust Levels to Specify Instrumentation Technologies


Instrumentation may run well in the development environment, but virtually all production
servers run with a reduced set of privileges. Commercial hosting providers may provide even
more limited capabilities than in-house operations. In particular, it may be necessary to
propagate message and trace files using FTP. Without access to platform capabilities, such as
the Windows Event Log, an application may be limited to generating messages or trace files in
order to capture any production events.
In some cases, specific information about the deployment environment for an application is not
known in advance, or the application must be designed to support multiple target
environments. In these cases, you will need to design the application to support configuration of
instrumentation technologies at deployment time or run time, using trust levels for the different
target environments. For more information about how to achieve this, see Chapter 6,
"Specifying Infrastructure Trust Levels."

Designing Application Instrumentation


You should consider the following proven instrumentation design practices when designing the
instrumentation for an application:

• Use the capabilities of the underlying platform.


• Provide separate instrumentation for each purpose.
• Isolate abstract instrumentation from specific instrumentation technologies.
• Create an extensible instrumentation architecture.
• Use base events for instrumentation.
• Use event names and event IDs consistently.
• Ensure events provide backward compatibility.
• Support logging to remote sources.
• Consider distributed event correlation.

The next sections describe each of these design practices in more detail.

Use the Capabilities of the Underlying Platform


In many cases, the platform will already provide some of the instrumentation required by an
application. For example, adding a performance counter that shows the number of open
database connections may not provide any additional value over the existing ADO.NET counter,
or logging that a Windows service has started may provide no more information than that
provided by the Windows Service Control Manager.
In some cases, your application may need to be supported on earlier versions of the operating
system, and you should consider this when determining which technologies to use. For example,
if you are running an application on Windows Vista, but the application will also run on
Windows XP, you should ensure that you do not exclusively use Windows Eventing 6.0.

Provide Separate Instrumentation for Each Purpose


Instrumentation can have a number of different uses, including determining health,
performance analysis, capacity planning, and developer debugging. Wherever possible, you
should use discrete sets of instrumentation for each purpose. Doing this allows you to configure
the application instrumentation on a case-by-case basis. For example, you may not want to
enable the instrumentation for capacity planning because it may affect the performance of the
application, but you may consider the instrumentation that indicates application health to be
required at all times.

Isolate Abstract Instrumentation from Specific Instrumentation Technologies


In traditional applications, where instrumentation exists, it is generally coded directly into the
application. This means the application developer must have a good understanding of that
specific form of instrumentation to be able to write the application. If run-time configuration of
instrumentation technologies is required (as suggested previously in this chapter), the
application developer must write code to implement each instrumentation technology and
support run-time configuration to choose between them.
Isolating instrumentation code from application code helps to solve many of these problems. In
this case, the architect describes abstract events, and a mapping between those events and
concrete instrumentation technologies. The developer can then call the abstract events in code,
and an instrumentation helper is responsible for the actual instrumentation using concrete
instrumentation technologies. This simplifies the development of instrumentation and does not
require developers to learn multiple instrumentation technologies. If the instrumentation
helpers are automatically generated, the overall development effort can be streamlined.
In cases where run-time configuration of instrumentation technology is required using trust
levels, isolating instrumentation code also helps. In this case, more than one instrumentation
helper can be deployed (one for each trust level) and the trust level can be chosen through
configuration. This allows the instrumentation technologies used to change without altering the
application itself.
Figure 1 illustrates an application designed to use abstract events for instrumentation.
Figure 1
Using abstract events for instrumentation
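The isolation shown in Figure 1 can be sketched as follows. Python is used here as neutral pseudocode; the provider classes stand in for concrete technologies such as the Windows Event Log or a trace file:

```python
class InstrumentationProvider:
    """Base class for concrete instrumentation technologies."""
    def write(self, event_name, message):
        raise NotImplementedError

class TraceFileProvider(InstrumentationProvider):
    """Stand-in for a concrete technology; a real provider would
    append to a trace file on disk."""
    def __init__(self):
        self.lines = []
    def write(self, event_name, message):
        self.lines.append(f"{event_name}: {message}")

class InstrumentationHelper:
    """Maps abstract events onto whichever concrete providers are
    configured, so application code never names a technology directly."""
    def __init__(self, providers):
        self.providers = providers
    def raise_event(self, event_name, message):
        for provider in self.providers:
            provider.write(event_name, message)

# Application code calls only the abstract event; the helper decides
# which technologies actually record it.
trace = TraceFileProvider()
helper = InstrumentationHelper([trace])
helper.raise_event("OrderQueueFull", "The order queue has reached capacity")
print(trace.lines[0])
```

Swapping the provider list (for example, one list per trust level) changes the instrumentation technologies without altering the application itself.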

Create an Extensible Instrumentation Architecture


Defining an abstract representation of instrumentation is truly useful only if the instrumentation
helper supports the specific instrumentation technology that you use in your environment. By
defining an extensible architecture using pluggable instrumentation providers, you can ensure
that instrumentation helpers can be developed to support other technologies (such as log4net)
when they are required.

Use Base Events for Instrumentation


A number of technologies may be used for collecting events raised by an application, including
Windows Event Logs, Windows Management Instrumentation (WMI), and Windows Eventing
6.0. Events may also be written to trace files, placed in a database, or sent as messages to local
or remote services. The structure and calling syntax may be different between these
technologies, but they consist of mostly common elements. Using a common base class can
minimize the effect on the application of changes in the instrumentation technology. In
addition, a structured approach to building the event will make your events more consistent,
which will help with automation (through scripting or Windows PowerShell) and search-aware
tools, such as the Event Viewer that ships with Windows Eventing 6.0.
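A base event along these lines might look like the following sketch. Python is used for illustration; a .NET implementation would express the same idea as a common base class, and the field names here are assumptions rather than a prescribed schema:

```python
from datetime import datetime, timezone
import os, socket

class BaseEvent:
    """Common elements shared by all events, regardless of whether they are
    ultimately written to an event log, a trace file, or a remote service."""
    def __init__(self, event_id, name, message):
        self.event_id = event_id
        self.name = name
        self.message = message
        # Context captured once, consistently, for every event.
        self.machine = socket.gethostname()
        self.process_id = os.getpid()
        self.timestamp = datetime.now(timezone.utc)

    def to_record(self):
        """A uniform structure helps scripting and search-aware tools."""
        return {"id": self.event_id, "name": self.name,
                "message": self.message, "machine": self.machine,
                "pid": self.process_id,
                "time": self.timestamp.isoformat()}

class DatabaseReadFailedEvent(BaseEvent):
    """A specific event only adds its own detail; the common structure
    comes from the base class."""
    def __init__(self, query):
        super().__init__(event_id=2001, name="DatabaseReadFailed",
                         message=f"Query failed: {query}")
```

Because every event shares one shape, a change of instrumentation technology touches only the code that serializes the record, not every event in the application.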

Use Event Names and Event IDs Consistently


The consistent use of names for events and their parameters enables subsequent filtering of
events to focus on only the relevant events. Windows Eventing 6.0 provides built-in support for
filtering, but a consistent use of event string parameters can enable a script or Windows
PowerShell command to process an equivalent subset of recorded events.
Each event ID should uniquely identify an occurrence or failure in your application. You should
not try to use the same ID to report multiple different events, because this can prove confusing
to the operations team.

Ensure Events Provide Backward Compatibility


Many tools rely on particular events to function correctly. If you change the meaning of an event
or delete an event you can cause errors in a tool that monitors for that event. For events used
for purposes other than developer debugging, you should only add new parameters to an event
and make them optional. If you need to remove parameters, or make the parameters required,
you should define new events for this purpose.
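For example, a hypothetical event might evolve as follows; the event name and parameters are invented for illustration. The key point is that the new parameter is optional, so tools written against the earlier version are unaffected:

```python
def raise_order_rejected(order_id, reason, retry_after=None):
    """Version 2 of a hypothetical event: 'retry_after' was added as an
    optional parameter, so tools that only know about version 1
    (order_id and reason) continue to work unchanged."""
    record = {"name": "OrderRejected", "order_id": order_id, "reason": reason}
    if retry_after is not None:
        record["retry_after"] = retry_after   # new, optional field
    return record

# Old call sites keep working unchanged:
old = raise_order_rejected("A-1001", "credit check failed")
# New call sites may supply the extra detail:
new = raise_order_rejected("A-1002", "warehouse offline", retry_after=300)
```

Removing a parameter or making it required would instead call for a new event with a new ID, as described above.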

Support Logging to Remote Sources


In some cases, it may be most efficient for an application to asynchronously log events
directly to a remote computer. In other cases, the process of collecting the data must be done
locally, but for performance reasons the processing of the data may be done remotely on
dedicated computers.
If the data must be collected locally, it is often more efficient to compress the events first, and
then send them to a centralized location as a batch.

Windows Vista supports event forwarding, which allows the application to write events locally
and then have the events forwarded to a centralized location automatically.
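A minimal sketch of the collect-locally, compress, and batch approach follows. The transport itself is omitted, and zlib stands in for whatever compression the environment provides:

```python
import json, zlib

class BatchingEventSender:
    """Buffers events locally, then compresses them into a single
    batch for transfer to a centralized location."""
    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []

    def log(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            return self.flush()
        return None

    def flush(self):
        payload = zlib.compress(json.dumps(self.buffer).encode("utf-8"))
        self.buffer = []
        # A real implementation would now send 'payload' to the central
        # collector (for example, over HTTP or a message queue).
        return payload

sender = BatchingEventSender(batch_size=2)
sender.log({"name": "HeartBeat"})
batch = sender.log({"name": "HeartBeat"})   # second event triggers a flush
events = json.loads(zlib.decompress(batch).decode("utf-8"))
print(len(events))  # 2
```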

Consider Distributed Event Correlation


Many applications today are composed of multiple pieces running on different computers. In
these cases, and when logging to remote sources is enabled (as explained in the previous
section) it is important to be able to generate an integrated picture. The use of an application-
generated unique key may be necessary to understand the relationship between distributed
events. In particular, you must include sufficient data—such as network address, cluster ID, and
so on—to allow others to correlate with multiple sources. Windows Eventing 6.0 supports
ActivityID (and RelatedActivityID), which can be used for event correlation.
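The correlation idea can be sketched as follows; the activity key here is simulated with a generated UUID, whereas Windows Eventing 6.0 carries ActivityID natively:

```python
import uuid

def start_activity():
    """Generate one key for a logical operation; every event raised on
    any machine while servicing that operation carries the same key."""
    return str(uuid.uuid4())

def make_event(activity_id, machine, name):
    return {"activity_id": activity_id, "machine": machine, "name": name}

# The front-end server starts the activity and passes the key downstream
# with each call, so all tiers tag their events with it.
activity = start_activity()
events = [
    make_event(activity, "WEB01", "OrderReceived"),
    make_event(activity, "APP02", "OrderValidated"),
    make_event(activity, "DB01",  "OrderStored"),
]

# A central tool can later reassemble the distributed picture.
correlated = [e for e in events if e["activity_id"] == activity]
print(len(correlated))  # all three events belong to the same operation
```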

Developing the Instrumentation


If you are responsible for programming the application events that ultimately raise
instrumentation events, you should consider the following proven instrumentation
development practices:

• Minimize resource consumption.


• Consider the security of the event information.
• Supply appropriate context data.
• Record the times events are generated.
• Provide resolution guidance.

The next sections describe these development practices in more detail.

Minimize Resource Consumption


There is a direct tradeoff between how much information can be collected and the application's
ability to service requests. Wherever possible, instrumentation should take up less than 10
percent of total processing of the application. It is desirable that the instrumentation does not
itself degrade the application, but if critical information is not being logged, the instrumentation
is not sufficient to support its primary goal.
In cases where complex calculations must be performed on instrumentation, these should
normally be deferred to an external process, so that they do not adversely affect the
performance of the application. The calculations may be performed at a later time, or even on a
different computer.
Logging to remote sources, as discussed in the previous section, can also help to improve
performance. Providing instrumentation that is configurable at run time is another practice that
can help improve performance. Run-time configuration allows you to define extensive
instrumentation for your applications and to enable that instrumentation only when required,
ensuring the best available performance in your applications.
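The deferral principle can be sketched like this (illustrative only): the application records raw observations cheaply, and an external process performs the expensive calculations later, possibly on another computer:

```python
class CheapCounter:
    """Inside the application: record raw observations only, which is a
    constant-time operation with negligible overhead per request."""
    def __init__(self):
        self.samples = []
    def record(self, value):
        self.samples.append(value)

def summarize(samples):
    """Outside the application (later, or on a different computer): the
    potentially expensive calculations are performed here instead."""
    n = len(samples)
    mean = sum(samples) / n
    p95 = sorted(samples)[int(0.95 * (n - 1))]
    return {"count": n, "mean": mean, "p95": p95}

counter = CheapCounter()
for ms in (12, 15, 11, 240, 14):     # request latencies in milliseconds
    counter.record(ms)
stats = summarize(counter.samples)
print(stats["count"])  # 5
```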

Consider the Security of the Event Information


You must be extremely careful if you are logging any sensitive data such as passwords, account
names, SIDs, user data, or events that track end-user activity that impacts privacy. Putting
sensitive information about an exception into an event mechanism on the server may seem
more secure than returning it through the call stack, but you need to understand the security
policies and restrictions of that event mechanism. For example, when applications are hosted on
third-party servers, the reduced trust levels on those servers may result in the instrumentation
being recorded into a text file in the application directory.

Supply Appropriate Context Data


Information such as the machine name, user name, and process ID are all contextual
information that could be critical to isolate a problem. Eventing technologies may automatically
supply some of these values, but they may represent the identity servicing the request as
distinct from the identity making the request. When supplying context data, you should also
consider the security of the event information, as mentioned earlier.

Record the Times Events Are Generated


The system time when an event is generated may differ from the time when it is received or
processed, particularly when events are logged to a centralized location. In many cases, the
order in which events are received, and how they match observed behavior in an application, is
critical to troubleshooting problems with an application. Therefore, it is very important that
events include accurate information about the time they are raised. Where possible, each of the
systems raising events should be time synchronized. This is particularly challenging in distributed
systems where systems that are off even by a few milliseconds may make it very hard to
correlate based on time. In those systems, it is sometimes easier to correlate on a
synthetically generated key that is invariant.

Provide Resolution Guidance


The event string should contain sufficient information to give the operator guidance on how to
proceed in resolving the issue. This may include guidance on what other managed entities may
be affected by a problem, along with contextual information. In some cases, the event will
include a URL that points to centralized documentation or a Help topic ID for a Help file.

Building and Deploying Instrumentation


If you are responsible for building and deploying manageable applications, you should consider
the following instrumentation guidelines.
• Automate the implementation of instrumentation.
• Automate the build and deploy process.
• Monitor applications remotely.

The next sections describe these instrumentation guidelines in more detail.

Automate Implementation of Instrumentation


Generating instrumentation code from a model or template will make the structure and calling
syntax more consistent and robust. It will also enable generating meta-information about the
instrumentation, such as the meta-information that might be used by an installer, as discussed
in the next section.

Automate the Build and Deploy Process


For instrumentation, automating the build and deploy process involves creating an installation
package that uses the Installer tool (Installutil.exe) to handle counters and event logs that need
to be registered at install time. Automating this process saves considerable effort and makes the
process more reliable.

Monitor Applications Remotely


The process of collecting, interpreting, and displaying the instrumentation information increases
the load on the application server. In most cases, you should monitor from a remote computer.
Network activity is normally monitored on the local computer, because monitoring the network
remotely can affect performance and therefore lead to unreliable data.

Summary
This chapter has examined proven practices that can be used for instrumentation when
architecting, designing, and deploying applications. You should use this chapter in conjunction
with Chapter 2, "Architecting Manageable Applications," to ensure that you create instrumented
applications that support the overall goal of designing manageable applications.
Chapter 6
Specifying Infrastructure Trust Levels
It is important to understand the infrastructure to which an application is going to be deployed
when the application is first developed. This allows the solutions architect to ensure that the
application will work as expected in the target environment; it also allows the application to
take full advantage of the existing infrastructure.
If different deployment infrastructures are not considered at design time, the application will
not function as expected in some cases; this results in increased costs from change requests at
staging or deployment time. However, it is not always possible to determine the exact nature of
the deployment infrastructure at design time. In many cases, the full details of each datacenter
are not known. Even when the deployment infrastructure is well known at design time, there
may still be a requirement to support multiple infrastructure environments.
In some cases, changes will occur to the target infrastructure during application development.
The deployment environment may also change during deployment or after deployment in a way
that affects the application. If these changes are communicated back to the application
architect, appropriate changes can be made to the application design.
Understanding the target infrastructure is particularly important when designing manageable
applications. The environment into which an application is deployed can have a significant effect
on the instrumentation technologies that can be used in that application. For example, if an
application cannot be installed or run with administrator privileges, that application cannot
write to an event log.
The developer architect must ensure that the application uses instrumentation that will work in
the target environment. In cases where the target environment is not known, the architect will
typically need to support multiple forms of instrumentation, and allow the specific technologies
to be configured at deployment time or run time.

You should deploy applications into the lowest trust environment that still allows proper
execution. The trust environment used leads to design decisions that a developer architect
must make early in the development life cycle.

This chapter describes different infrastructure model scenarios, and then it examines tools that
can be used to create an infrastructure model. The chapter then describes how the Team
System Management Model Designer Power Tool (TSMMD) can be used to capture information
about an infrastructure pertinent to manageable applications.

Infrastructure Model Scenarios


It can be important to model the infrastructure of a target environment in many cases.
However, in creating the guidance included in this book, two main scenarios have been
considered:
• Development of an in-house application
• Development of an ISV or shrink-wrapped application

The next sections describe these two scenarios.

In-House Application Scenario


In this scenario, the solutions architect and infrastructure architect have access to detailed
information about the corporate datacenter into which the application will be deployed. Often,
they will also have gained insight from past deployments that may not have gone well or as
intended. This knowledge can be represented in an infrastructure model that is used by the
developer architect when defining the specifications for the application.
The in-house infrastructure model definition should include named instances of the operational
environments contained in the test datacenter, staging datacenter, and production datacenter.
These names can be descriptive of the actual datacenter, such as Production, or they may
represent a security level within the datacenter, such as Medium. In many cases, it will be a
combination of the two, such as PROD_MED.
Security level specifications may be mapped to a platform specification. For example, the
Microsoft Windows Vista operating system contains well-defined operating system trust levels
named High trust, Medium trust, and Low trust.
Even though the target environment is generally well known in this scenario, it may still be
necessary to support multiple infrastructures. For example, the application may be deployed in
several different data centers with different infrastructures or the infrastructure may differ at
different stages of the application life cycle. When a developer architect is designing a
manageable application in this scenario, he or she must ensure that the instrumentation
technologies used are compatible with the target environment or environments that the model
defines. The developer architect must also allow the application to be configured at deployment
time (and possibly at run time) to support the particular environment used.

ISV or Shrink-Wrap Application Scenario


In this scenario, the deployment environment is unknown, or a wide range of deployment
scenarios must be supported. To address these concerns, the developer architect must design
the application with flexibility in mind, so that the infrastructure architect or the operations
team can configure the application at deployment or run time to support the requirements of
the specific infrastructure used.
In many ways, this scenario can be thought of as the inverse of the preceding scenario. In the
preceding scenario, the deployment infrastructure is well known at the beginning of the
development cycle, and a comprehensive infrastructure model can be used to influence the
development of the application. In this scenario, the deployment infrastructure is not well
known, so the infrastructure model that is used influences the infrastructures that can be
supported. In this case, the more flexible the model, the more environments the application can
successfully be deployed into.
When a solutions architect is designing a manageable application in this scenario, he or she
must ensure that instrumentation is provided for each of the proposed target infrastructures.
He or she must also provide configurability so that those deploying the application can ensure
that the correct instrumentation is used in a particular scenario.

Privilege and Trust Considerations


Privileges are an important consideration when defining trust levels (called target environments
in the Team System Management Model Designer Power Tool). Privileges to perform specific
operations, such as those used by the instrumentation technologies discussed in this guide, may
be allowed or denied at three distinct levels—the operating system, the application host (most
notably IIS), and by .NET Framework security. These levels build on each other, and the
operating system has the ultimate role in denying access to resources. For example, a .NET
Framework Web application may be running under a .NET Framework security policy that allows
writing to the file system, which defers to the IIS file permissions for the site's home directory,
and then defers to any file access or deny permission set in the Windows file system. When
defining instrumentation technologies for a managed entity, the privileges associated with the
run-time process context that will host the managed entity will likely decide what
instrumentation options are available.
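The effect of trust levels on technology selection can be sketched as follows; the level names and the technology lists in this mapping are illustrative assumptions, not an authoritative statement of what any environment permits:

```python
# Hypothetical mapping from target-environment trust level to the
# instrumentation technologies assumed to be available there.
AVAILABLE = {
    "high":   ["event_log", "performance_counters", "wmi", "trace_file"],
    "medium": ["event_log", "trace_file"],
    "low":    ["trace_file"],          # e.g. a hosted, least-privilege server
}

def choose_technologies(requested, trust_level):
    """Keep only the technologies that the deployment environment permits."""
    allowed = AVAILABLE[trust_level]
    return [t for t in requested if t in allowed]

wanted = ["performance_counters", "event_log", "trace_file"]
print(choose_technologies(wanted, "low"))   # only the trace file survives
```

A mapping of this kind, captured in the management model, lets the same application emit the richest instrumentation each environment allows.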
Privileges are particularly important as an application moves from a developer's environment
(which typically runs as Administrator with correspondingly high trust), through testing, and into
production (where the security principle of "least privilege" often results in access being denied
to required platform capabilities that were present during development).
Also, the privileges that are required to run the application and emit instrumentation from a
managed entity may be different from the privileges that are required to install the application
and its associated instrumentation. For example, many infrastructure configurations may allow
an application to write to an event log or increment a performance counter, but the event log itself
and any named performance counters (and message queues) can only be created by code with
Administrator privileges. For this reason, it is common practice to move such high trust
operations outside the execution of the application and into a separate install process, which is
more commonly run with elevated privileges. For more information, see chapters 9–12 of this
guide.
Security in the .NET Framework is often less well understood by operations staff than platform
and core services security. The principle of least privilege in this context means that while .NET
Framework applications are typically developed with the full trust permission set, they will
normally run in production environments using the LocalIntranet or LowTrust permission sets.
Named permission sets may be specified at the computer level or the "enterprise" level, and
custom permission sets may be created to reflect combinations of permissions that are not
offered by one of the built-in permission sets.
The following table lists the built-in permission sets and their corresponding effects on
instrumentation.

Permission set   Effect on instrumentation
LocalIntranet    Write access to isolated storage only, not the file system.
LowTrust         Write access to isolated storage, the local event log, and the Web.
MedTrust         Write access to isolated storage, the local event log, the Web, the file
                 system, and the system registry.
Everything       Unrestricted access to performance counters and the event log, message
                 queues, and the service controller, in addition to the file system, the
                 system registry, and the Web.

Several of these capabilities have some overlap. For example, creating a performance counter
requires administrator privileges because it essentially writes to a secure part of the system
registry.
Figure 1 illustrates the .NET Framework Configuration tool that can be used to specify the
permission set used for an application.
Figure 1
.NET Framework Configuration tool
The Team System Management Model Designer Power Tool (TSMMD) does not attempt to
enforce selections made when defining trust levels or the instrumentation technologies that are
available within the trust level. This is because each deployment infrastructure is potentially
unique.
This means the TSMMD is flexible in that it may be adapted to a wide range of infrastructure
definitions represented as trust levels. But with this flexibility comes the opportunity to make
mistakes when defining available instrumentation technologies for a target deployment
environment.

Tools for Infrastructure Modeling


Software tools for infrastructure modeling can take a number of forms, including the following:
• Standalone tools
• Tools integrated with the development environment

The next sections describe each of these types of tools in more detail.
Standalone Tools
Standalone tools, such as Microsoft Visio, can be used to create a comprehensive model of each
infrastructure that should be supported by an application. In its simplest form, the model would
consist of a named instance of each infrastructure element, with a number of properties
defined. This model can be modified over time as the infrastructure changes.
In some cases, a standalone tool may be designed to export information directly to a
development environment, allowing the architect to use the information directly in his or her
design. In other cases, the architect would simply read the model and use it to make decisions
about the design of the application.

Integrated Tools
Integrated tools, such as the TSMMD, make the infrastructure model part of the overall design
of the application. This allows the architect to model elements of the infrastructure directly at
design time, and it ensures that infrastructure considerations are not missed when the
application is developed. The TSMMD also allows the instrumentation implementations for each
infrastructure scenario to be automatically generated. The next section provides more details
about the TSMMD.

Infrastructure Modeling with the TSMMD


The TSMMD allows you to define infrastructure models in the context of creating manageable
applications. This means defining one or more target environments, specifying instrumentation
that will be useable in those environments, and ensuring that the correct instrumentation is
mapped to the correct target environment. The TSMMD does not consider other aspects of
application functionality that may be affected in different target environments. For example,
using a different communication mechanism for a different infrastructure would not form part
of a management model defined in the TSMMD.
The TSMMD uses target environments to represent the different infrastructures that an
application may be deployed into. Target environments can be given any name, but typically
they will have names such as high, medium, or low, to reflect the level of trust in the target
environment. Target environments are defined across all managed entities that make up an
application, and each managed entity can be configured to use one or more of the defined
target environments.
To create a target environment in the management model explorer, use the New Management
Model Wizard or simply right-click the Management Model root and then click Add New Target
Environment to specify the named instance and its properties.
Specifying target environments in the TSMMD tool forces the application architect to consider
the instrumentation options available when defining the abstract instrumentation at design
time. This should lead to predictable instrumentation implementation when developers call the
instrumentation helpers within the solution.
The TSMMD tool can also be used to generate instrumentation code. This simplifies changes to
the infrastructure later in the application life cycle. Any changes can be reflected by simply
updating the model definition and regenerating the instrumentation implementations.

Instrumentation Technologies Supported by the TSMMD


The TSMMD supports the following instrumentation technologies:

• Enterprise Library Logging. Enterprise Library, from the Microsoft patterns & practices
division, contains the Logging Application Block that allows developers to perform a
wide range of logging tasks using a standardized and easy-to-use interface. The TSMMD
integrates with the Logging Application Block to allow architects to model and specify
events that the Logging Application Block will handle, and which it will write to the
configured target. By default, the TSMMD configures the Logging Application Block to
write events to the Windows Event Log, but administrators can change the
configuration to send events to any target medium supported by the Logging
Application Block (such as email, database, MSMQ, or text files).
• Windows Event Logging. The Windows Event Log service enables an application to
publish, access, and process events. Events are stored in event logs, which can be
routinely checked by an administrator or monitoring tool to detect occurrences of
problems on a computer. The Windows Event Log SDK allows users to query for events
in an event log, receive event data as events occur (subscribe to events), create event
data and raise events (publish events), and display event data in a readable format
(render events). For more information about Windows Event Logging, see Chapter 9,
"Event Log Instrumentation."
• Windows Eventing 6.0 events. The Windows Eventing 6.0 service added to Windows
Vista and Windows Server 2008 extends the capabilities of the event logging system
while still providing familiar access to Windows Event Logs. Applications can publish,
access, and process events, and administrators or monitoring tools can use the logs to
detect occurrences of problems on a computer. For more information about Windows
Eventing 6.0 Event Logging, see Chapter 11, "Windows Eventing 6.0 Instrumentation."
• Event Tracing for Windows. Event Tracing for Windows (ETW) provides application
programmers the ability to start and stop event tracing sessions, instrument an
application to provide trace events, and consume trace events. Trace events contain an
event header and provider-defined data that describes the current state of an
application or operation. You can use the events to debug an application and perform
capacity and performance analysis.
• WMI events. Windows Management Instrumentation (WMI) is the instrumentation
standard used by management applications such as Microsoft Operations Manager
(MOM), Microsoft Application Center, and many third-party management tools. The
Windows operating system is instrumented with WMI, but developers who want their
own products to work with management tools must provide instrumentation in their
own code. WMI in the .NET Framework is built on the original WMI technology and
allows the same development of applications and providers with the advantages of
programming in the .NET Framework. For more information about WMI events, see
Chapter 10, "WMI Instrumentation."
• Performance counters. Windows collects performance data on various system
resources using performance counters. Windows contains a pre-defined set of
performance counters with which you can interact; you can also create additional
performance counters relevant to your application. For more information about how to
programmatically create performance counters and how to read performance counters,
see Chapter 12, "Performance Counter Instrumentation."

Northern Electronics Scenario


The infrastructure architect has provided information to the solutions architect about the
potential deployment environment for the application. As a result, the solutions architect has
determined the following:
• Each workstation application will log events to the event log.
• Each Web service may log events to the event log and generate performance counters.
However, the Web service may also need to run in a lower trust environment, where it
can only log events to a trace file.

To support this scenario, the solutions architect decides to define two target environments:
• Low trust will be used when a managed entity is writing to a trace file. Low trust will be
defined as a target environment for all the Web service managed entities.
• High trust will be used when the managed entity is writing to the event log and creating
performance counters. High trust will be defined as a target environment for both the
Web service and the workstation application managed entities.
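The architect's decisions amount to a mapping between managed entities, target environments, and the instrumentation each environment allows. The following sketch (in Python purely for illustration; in practice this mapping lives in the .tsmmd model file, and all names here are assumptions) shows that mapping as data:

```python
# Illustrative sketch of the Northern Electronics target-environment mapping.
# The real mapping is stored in the .tsmmd model; these names are assumptions.
TARGET_ENVIRONMENTS = {
    "LowTrust": {"trace_file"},
    "HighTrust": {"event_log", "performance_counters"},
}

MANAGED_ENTITIES = {
    "WorkstationApplication": ["HighTrust"],
    "ShippingWebService": ["LowTrust", "HighTrust"],
}

def allowed_instrumentation(entity: str, environment: str) -> set:
    """Return the instrumentation technologies an entity may use in an environment."""
    if environment not in MANAGED_ENTITIES[entity]:
        raise ValueError(f"{entity} is not configured for {environment}")
    return TARGET_ENVIRONMENTS[environment]
```

For example, `allowed_instrumentation("ShippingWebService", "LowTrust")` yields only `{"trace_file"}`, matching the decision that the Web service must fall back to trace files in a low trust environment.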

Summary
This chapter discussed the use of infrastructure trust levels, which are used to support different
target deployment environments. If an architect can define different infrastructure models that
the application supports, the decision about which instrumentation technologies to use can be
deferred until the application is deployed or run. This helps to ensure that the application will
function as expected in the target environment, without requiring changes to the underlying
application code.
Chapter 7
Specifying a Management Model
Using the TSMMD Tool
Creating a management model for your application can be somewhat challenging. To simplify
the process, this guide includes a tool, known as the Team System Management Model Designer
Power Tool (TSMMD), which allows you to graphically model an operations view of the
application. You can use the tool to apply instrumentation and some basic health artifacts to
this model.
This chapter describes the requirements for the TSMMD, and then it demonstrates how to use it
to create a management model for your application.

Requirements for the TSMMD


To install the TSMMD tool, your computer should meet the following software prerequisites:
• Windows XP, Windows Vista, or Windows Server 2003

• One of the following versions of Visual Studio 2008:


◦ Visual Studio Team System 2008 Architecture Edition
◦ Visual Studio Team System 2008 Database Edition

◦ Visual Studio Team System 2008 Developer Edition


◦ Visual Studio Team System 2008 Test Edition

◦ Visual Studio Team System 2008 Team Suite


◦ Visual Studio Team System 2008 Team Foundation Server

You must install both the C# and C++ languages when you install Visual Studio 2008.
The TSMMD requires C++ to generate instrumentation for Windows Eventing 6.0
events.

• Guidance Automation Extensions (GAX) version 1.4 or later


• Enterprise Library version 4.0

To obtain the Team System Management Model Designer Power Tool, visit the Design For
Operations community Web site at http://www.codeplex.com/dfo/.
Creating a Management Model
The following are the high-level steps to creating a management model with the TSMMD tool:
1. Create a TSMMD file.
2. Graphically model an operations view of the application.
3. Define Target Environments for the application.
4. Define instrumentation for the application.
5. Create health definitions for the application.
6. Validate the model.

The TSMMD Guided Experience


One of the major features of the TSMMD is a series of wizards that make up its guided
experience. These wizards save architects and developers time, simplify the process of building
a management model, and reduce the opportunities for errors.
The Team System Management Model Designer Power Tool includes the following guided
experience wizards that help you to configure individual parts of a management model:
• New Managed Entity wizard. This wizard helps you to create a new managed entity,
and set its properties.
• New Aspect wizard. This wizard helps you to create a new health state aspect for a
managed entity, and define the abstract instrumentation that implements the health
transition indicators for the new aspect.
• New Event Implementation wizard. This wizard helps you to create implementations of
an abstract event, including specific implementations for each of the instrumentation
technologies you specify in the target environments of the model.
• New Measure Implementation wizard. This wizard helps you to create
implementations of an abstract measure, including specific implementations for each of
the instrumentation technologies you specify in the target environments of the model.
• Discover Instrumentation wizard. This wizard helps you to locate and identify existing
instrumentation in the assemblies of your application, and import the definitions into
the model so that you can associate them with abstract events and measures.

Creating the TSMMD File


The TSMMD file is used to hold the information contained within the model. It can be used to
generate instrumentation artifacts in the code; potentially, it can be used to provide information
for a System Center Operations Manager Management Pack.
The following procedure assumes that you already have a Visual Studio solution in place.
The model is created at the solution level because it will commonly span more than one project.
To create the .tsmmd file and start using the TSMMD tool
1. Start Visual Studio 2008 Team System Edition, click the File menu, point to New, and
then click Project.
2. In the New Project dialog box, click TSMMD Project in the list of project types, and then
click TSMMD Project in the list of projects. Enter a name and location for the new
project and click OK. This creates a new TSMMD project containing a new management
model named operations.tsmmd. The Management Model Explorer window appears
showing this new empty model, and the blank model designer surface appears in the
main window.

If you cannot see the Management Model Explorer window, click the View menu,
point to Other Windows, and then click Management Model Explorer.

3. Ensure that the guidance packages for the TSMMD are loaded. To do this, click
Guidance Package Manager on the Visual Studio Tools menu. If the list of recipes in the
Guidance Package Manager dialog box does not contain any entries that apply to Team
System Management Model, follow these steps to enable the recipes:
◦ Click the Enable/Disable Packages button.

◦ Select the two guidance packages named Team System MMD Instrumentation
and Team System MMD Management Pack Generation.

◦ Click OK to return to the Guidance Package Manager dialog box.

◦ Click Close to close the Guidance Package Manager dialog box.

If you do not see the two guidance packages in the list, you may need to reinstall the
TSMMD guidance package.

4. In Management Model Explorer, select the top-level item named Operations. In the
Visual Studio Properties window, enter values for the Description, Knowledgebase,
Name, and Version properties. If you cannot see the Properties window, press F4.
5. In Management Model Explorer, expand the Target Environments node and select the
target environment named Default. Change the values of the properties to indicate the
instrumentation technologies you want to use in the default target environment.
6. If you want to add more target environments, right-click the top-level model entry in
Management Model Explorer and click Add New Target Environment, setting the
appropriate instrumentation technology check boxes for each one.

You use the properties of a target environment to specify that you require any
combination of Enterprise Library Logging events, Windows Event Log events, trace file
events, Windows Eventing 6.0 events, Windows Management Instrumentation (WMI)
events, and Windows performance counters for that target environment. You can also
add more than one target environment to a model to describe different deployment
scenarios.

7. On the File menu, click Save All to save the entire solution.

Graphically Modeling an Operations View of the Application


The TSMMD designer allows you to create a graphical operations view of the application. This
model can then be added to, both by the TSMMD tool and by other designers. The TSMMD
designer allows you to graphically model the following artifacts:
• Executable Application managed entities

• Windows Service managed entities (including data services such as databases)


• ASP.NET Application managed entities

• ASP.NET Web Service managed entities


• Windows Communication Foundation (WCF) services
• External Managed Entities

• Connections between managed entities

The following procedures detail how to model these artifacts in the designer.
To create managed entities using the Wizard:
1. Right-click the designer surface of the management model diagram or right-click the
top-level item in Management Model Explorer, and then click New Managed Entity
Wizard.
2. Enter the required information into the pages of the wizard. The wizard allows you to
specify the name, type, description, discovery type and target, and enable model
extenders for the new managed entity.

To create managed entities using the Toolbox:


1. Open the Visual Studio Toolbox, and then drag one of the managed entity types onto
the designer surface. You can choose Executable Application, Windows Service,
ASP.NET Application, or ASP.NET Web Service as the managed entity type.
2. In the designer, select the managed entity and modify the properties that specify the
discovery method (for use in a management pack), the description, the name, and the
type of the entity. If you cannot see the Properties window, press F4 or right-click the
managed entity, and then click Properties.
3. In the Properties window, modify any extended properties specific to the type of the
selected managed entity. For example, if you added an ASP.NET Application or ASP.NET
Web Service to the designer surface, you can specify the exception and performance
counter thresholds, sample times, and warning levels.
4. Repeat steps 1 through 3 for all the other managed entities in the application.

Each managed entity must have a name that is unique among the other managed entities and
external managed entities in the management model. Validation code checks for this and
prompts you with a dialog box if two entities are identically named.

For a local managed entity (an entity from the list above that is part of the application, but
excluding the External Managed Entity), you can specify values for the properties shown in the
following table.

Managed Entity property Description

Description This property contains the description of the entity.

Discovery Target This property, in conjunction with the Discovery Type, defines the way that the
monitoring system will locate the entity to check whether it exists on a monitored server;
in other words, it indicates whether this part of the application is deployed on that server.

Discovery Type This property defines where the monitoring system should look for the Discovery
Target value. Depending on the type of entity, the options are FilePath,
RegistryValue, ServiceName, and IISApplicationName.

Name This property contains the name of the entity.

Executable Application
The Executable Application entity represents a Windows Forms application, a console
application, or any other type of application that is not a Windows Service or an ASP.NET based
application or service. You can specify a Discovery Type of either FilePath or RegistryValue for
this type of managed entity. There are no additional properties for an Executable Application
entity.

Windows Service
The Windows Service entity represents a Windows Service that (usually) has no runtime user
interface. You can specify a Discovery Type of only ServiceName for this type of managed entity.
The Windows Service entity has one additional property shown in the following table.

Windows Service property Description

Windows Service Extension This Boolean property specifies whether the process that generates
Enabled management packs will add a specific extender monitor to the management
pack that checks the status of a Windows service by querying WMI at timed
intervals. The monitor will raise an alert if the service is configured to start
automatically and is not currently running. The monitor will not raise an alert
if the service is disabled or configured to start manually, or when it has been
stopped.

ASP.NET Application
The ASP.NET Application entity represents an ASP.NET application that runs on an Internet
Information Services (IIS) Web server. You can specify a Discovery Type of only
IISApplicationName for this type of managed entity. The ASP.NET Application entity has the
additional properties shown in the following table.

ASP.NET Application property Description

ASP.NET Extension Enabled This Boolean property specifies if the process that generates
management packs will add specific extender monitors to the
management pack. The default setting is False. When set to True, the
following properties in this table specify the parameters for the
extender monitors.

Exception Error Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Critical (RED) for
exceptions generated by the entity within a time specified by the
Exception Sample Time Interval property. The default value is 50.

Exception Sample Time Interval This property defines the duration in seconds that the extender monitor
counts exceptions occurring in the entity, and matches this figure to
the Exception Error Threshold and Exception Warning Threshold
values. The default value is 30 seconds.

Exception Warning Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Warning (YELLOW)
for exceptions generated by the entity within a time specified by the
Exception Sample Time Interval property. The default value is 30.

Performance Error Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Critical (RED) for
degraded performance measures incurred by the entity within a time
specified by the Performance Sample Time Interval property. The
default value is 50.

Performance Sample Time This property defines the duration in seconds that the extender monitor
Interval counts degraded performance measures incurred by the entity, and
matches this figure to the Performance Error Threshold and
Performance Warning Threshold values. The default value is 30
seconds.

Performance Warning Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Warning (YELLOW)
for degraded performance measures incurred by the entity within a
time specified by the Performance Sample Time Interval property. The
default value is 30.
Response Time (ms) This property defines the maximum time within which the application
must respond to the request. The default value is 5000 (5 seconds).

These extended properties allow you to specify the behavior of the application in terms of the
intrinsic performance and internal errors that it generates. This is useful for monitoring and
reporting scenarios that ensure the application meets business requirements and Service Level
Agreements (SLAs).
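The threshold behavior described in the table can be sketched as a simple function: the extender monitor counts exceptions (or degraded-performance measures) within one sample interval and compares the count to the warning and error thresholds. The sketch below is illustrative only (Python for brevity; the real monitors run inside the generated management pack, and whether the comparison is inclusive is an assumption):

```python
def health_state(event_count: int, warning_threshold: int, error_threshold: int) -> str:
    """Map a count of exceptions (or degraded-performance measures) observed
    within one sample interval to a health state, per the table above.
    Assumption: thresholds are inclusive (reaching the threshold triggers the state)."""
    if event_count >= error_threshold:
        return "RED"      # Critical
    if event_count >= warning_threshold:
        return "YELLOW"   # Warning
    return "GREEN"        # Healthy
```

With the default values from the table (warning threshold 30, error threshold 50, sampled over a 30-second interval), 10 exceptions leave the entity healthy, 35 turn it Warning, and 60 turn it Critical.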

ASP.NET Web Service


The ASP.NET Web Service entity represents a Web service implemented by ASP.NET as an ASMX
service and running on an Internet Information Services (IIS) Web server, or implemented as a
WCF Web service. You can specify a Discovery Type of only IISApplicationName for this type of
managed entity. The ASP.NET Web Service entity has the additional properties shown in the
following table.

ASP.NET Web Service property Description

ASP.NET Extension Enabled This Boolean property specifies if the process that generates
management packs will add specific extender monitors to the
management pack. The default setting is False. When set to True, the
following properties in this table specify the parameters for the
extender monitors.

Exception Error Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Critical (RED) for
exceptions generated by the entity within a time specified by the
Exception Sample Time Interval property. The default value is 50.

Exception Sample Time Interval This property defines the duration in seconds that the extender monitor
counts exceptions occurring in the entity, and matches this figure to
the Exception Error Threshold and Exception Warning Threshold
values. The default value is 30 seconds.

Exception Warning Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Warning (YELLOW)
for exceptions generated by the entity within a time specified by the
Exception Sample Time Interval property. The default value is 30.

Performance Error Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Critical (RED) for
degraded performance measures incurred by the entity within a time
specified by the Performance Sample Time Interval property. The
default value is 50.

Performance Sample Time This property defines the duration in seconds that the extender monitor
Interval counts degraded performance measures incurred by the entity, and
matches this figure to the Performance Error Threshold and
Performance Warning Threshold values. The default value is 30
seconds.
Performance Warning Threshold This property defines the threshold value at which point the extender
monitor will change the health state of the entity to Warning (YELLOW)
for degraded performance measures incurred by the entity within a
time specified by the Performance Sample Time Interval property. The
default value is 30.

Response Time (ms) This property defines the maximum time within which the service must
respond to the request. The default value is 5000 (5 seconds).

These extended properties allow you to specify the behavior of the service in terms of the
intrinsic performance and internal errors that it generates. This is useful for monitoring and
reporting scenarios that ensure the application meets business requirements and Service Level
Agreements (SLAs).

Windows Communication Foundation (WCF) Service


The WCF Service entity represents a Windows Communication Foundation (WCF) Web service.
For this type of managed entity, the architect can specify a Discovery Type of FilePath or
RegistryValue (if the service is self-hosting), IISApplicationName (if hosted by IIS), or
ServiceName (if hosted by a Windows service). There are no additional properties for a WCF
Service entity.
To create external managed entities for the application
1. Open the Visual Studio Toolbox, and then drag an External Managed Entity onto the
designer surface.
2. In the designer, select the external managed entity, and then modify its Name property.
If you cannot see the Properties window, press F4 or right-click the external managed
entity, and then click Properties.
3. Repeat steps 1 and 2 for all the other external managed entities used by the
application.

To model connections between managed entities


1. Open the Visual Studio Toolbox, and then click Connection.
2. Click the managed entity where the connection will start.
3. Click the input port where the connection will end.
4. Edit the Text property of the connection as required.
5. Repeat steps 1 through 4 for all other connections between entities.

You can use External Managed Entities to model services that your application consumes, but
which are not part of your model. You can also use External Managed Entities to split a large
model into smaller models. In this case, the External Managed Entity simply represents the
section that does not appear in the current diagram. It is important to avoid repeating
Managed Entities in more than one section of a management model.

However, you can add only one management model to a solution, and so—in this release—the
management model equates to the solution. If you need to divide your application into
multiple management models, you must create multiple solutions.

Everything outside of the model is classed as external. It is likely that external entities such as
databases, Web sites, and Web services will already be instrumented and managed by other
tools, such as existing management packs (for example, the System Center Operations
Manager pack for SQL Server).

Defining Target Environments for the Application


Target Environments are used to model different deployment environments for the application.
You can associate different types of concrete instrumentation with each target environment. For
example, in a low trust environment, it may not be possible to write to an event log, so writing
events to a file may be the preferred option.
To define target environments for the model:
1. In Management Model Explorer, right-click the top-level Management Model entry, and
then click Add New Target Environment. If you cannot see Management Model
Explorer, point to Other Windows on the View menu, and then click Management
Model Explorer.
2. The new target environment appears in the Target Environments section of
Management Model Explorer. Select the new target environment, and then modify its
Name property. If you cannot see the Properties window, press F4 or right-click the
target environment, and then click Properties.
3. Change the values for the types of instrumentation you will use in this target
environment. You can enable each type by setting the value of the corresponding
property to True. You can enable Enterprise Library Logging events, Windows Event Log
events, trace file events, Windows Eventing 6.0 events, Windows Management
Instrumentation (WMI) events, and Windows performance counters.
4. Repeat steps 1 through 3 for all other target environments that the model must
support.

For instrumentation helpers to be generated for managed entities, you must define target
environments and associate them with those managed entities. Validation code checks to
ensure that you have at least one target environment defined for each managed entity, and it
displays a warning if not.

Defining Instrumentation for the Application


The management model defines instrumentation in two ways:

• Abstract instrumentation. This instrumentation is an abstraction of the specific


instrumentation technology being used. By defining abstract events and measures, the
developer can call the abstract instrumentation rather than the specific technology.

• Implementations of the abstract instrumentation. This instrumentation corresponds to


the specific instrumentation technologies and is mapped to the abstract
instrumentation.
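This separation follows a familiar pattern: developer code raises an abstract event, and the concrete instrumentation technology behind it is selected by the active target environment. A minimal sketch of the pattern follows (in Python for brevity; the TSMMD actually generates C# helpers, and every name below is an assumption, not TSMMD output):

```python
from abc import ABC, abstractmethod

class OrderReceivedEvent(ABC):
    """Abstract event: what the developer's code calls."""
    @abstractmethod
    def raise_event(self, order_id: str) -> None: ...

class EventLogImplementation(OrderReceivedEvent):
    """Concrete implementation for a high trust environment."""
    def raise_event(self, order_id: str) -> None:
        print(f"[event log] Order {order_id} received")

class TraceFileImplementation(OrderReceivedEvent):
    """Concrete implementation for a low trust environment."""
    def raise_event(self, order_id: str) -> None:
        with open("trace.log", "a") as f:
            f.write(f"Order {order_id} received\n")

# The active target environment (read from configuration in practice)
# decides which implementation the helper hands back.
IMPLEMENTATIONS = {"HighTrust": EventLogImplementation,
                   "LowTrust": TraceFileImplementation}

def get_order_received_event(environment: str) -> OrderReceivedEvent:
    return IMPLEMENTATIONS[environment]()
```

Because the developer only ever calls the abstract event, redeploying the application into a different target environment changes the instrumentation without changing any calling code.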

Defining Abstract Instrumentation


You can create abstract events and measures as you specify the aspects (health states) of a
model by using the New Aspect Wizard, instead of creating each artifact manually. The New
Aspect Wizard makes it easy to define the health states for an aspect, and then create or select
abstract events and measures that indicate changes to the health state. If you use the New
Aspect Wizard, you will perform the following steps for each aspect:
1. Use the wizard to create the aspect and specify the type of instrumentation (events or a
measure) that provides the state change information.
2. Create the implementations of the events or the measure you specify in the wizard. You
can use the New Event Implementation Wizard or the New Measure Implementation
Wizard to create the implementations, or you can create them manually.
3. Create any parameters you need to pass application-specific values to the
instrumentation that the TSMMD creates. You must do this manually using the
procedure "Modeling abstract event parameters" described later in this topic.

If you decide to manually create the instrumentation for your model by adding each abstract
and concrete implementation individually, you will perform the following steps:
1. Specify the abstract instrumentation (events and measures) for each entity.
2. Specify the implementations of these events and measures for each target environment
for each entity.
3. Map the event and measure implementations to each aspect in the health model.

This section describes how to create both forms of instrumentation (abstract and implemented)
individually and how to map one to the other. It consists of the following procedures:
• Modeling abstract events
• Modeling abstract event parameters
• Modeling Enterprise Library logging events
• Modeling event log events

• Modeling Windows Eventing 6.0 events


• Modeling trace files

• Modeling Windows Management Instrumentation (WMI) events


• Modeling an abstract measure
• Modeling performance counters

To model an abstract event


1. In Management Model Explorer, click to expand the node of the managed entity for
which you want to add instrumentation.
2. Right-click the Management Instrumentation node, and then click Add New Event.
3. Select the new event in the Events section of the model.
4. In the Properties window, select a value for the Instrumentation Level property of the
event; you can select Coarse, Fine, or Debug. If you cannot see the Properties window,
press F4 or right-click the new event, and then click Properties.
5. Modify the value of the Name property of the event.
6. Repeat steps 2 through 5 for any other abstract events you require.

To model an abstract event parameter


1. In Management Model Explorer, click to expand the Management Instrumentation
node of the managed entity, and expand the Events node within it.
2. Right-click the abstract event, and then click Add New Event Parameter.
3. In the Properties window, modify the value of the Index property for the event. This
specifies the index of the placeholder (starting at 1) in the message template for the
value of this parameter. If you cannot see the Properties window, press F4 or right-click
the new event parameter, and then click Properties.
4. Modify the Name property for the event parameter.
5. Select the data type for the event parameter. You can select DateTime, Double, Int32,
Int64, or String.
6. Repeat steps 2 through 5 for any other event parameters you require.

An abstract event has two properties that you can set, as shown in the following table.

Abstract Event property Description

Instrumentation Level This property specifies the level at which the entity will raise the event. The options
are Coarse (all operations, the default), Fine (diagnostic and debug operations
only), and Debug (debug operations only). For information about how this setting
affects the behavior of an application, see Appendix A.

Name This property contains the name of the abstract event definition.

In addition, you will define one or more parameters for each abstract event. For each
parameter, architects set the three properties shown in the following table.

Event Parameter property Description

Name This property contains the name of the event parameter.

Index This property is an integer value that specifies which placeholder in the
message template the value of the parameter will replace.

Type The data type of the parameter. The available types are DateTime, Double,
Int32, Int64, and String (the default).

Event parameter names should use title-style capitalization (the first letter must be
capitalized). If an event defines multiple parameters, their Index values should run in
increasing order from 1, with no duplicates and no missing integers. Validation code checks
both rules and displays an error message if either is violated.
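
These two rules can be sketched as a simple check; the helper below is a hypothetical illustration of the validation, not the TSMMD's actual code:

```python
# Sketch of the two parameter rules described above (hypothetical helper,
# not the actual TSMMD validation code).

def validate_event_parameters(params):
    """params is a list of (name, index) tuples for one abstract event."""
    errors = []
    for name, _ in params:
        # Title-style capitalization: the first letter must be capitalized.
        if not name[:1].isupper():
            errors.append(f"Parameter '{name}' must start with a capital letter")
    # Index values must run 1..n with no duplicates and no gaps.
    indexes = sorted(index for _, index in params)
    if indexes != list(range(1, len(params) + 1)):
        errors.append(f"Indexes {indexes} must run consecutively from 1")
    return errors

print(validate_event_parameters([("OrderId", 1), ("Reason", 2)]))  # []
print(validate_event_parameters([("orderId", 1), ("Reason", 3)]))  # two errors
```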

To model an abstract measure


1. In Management Model Explorer, click to expand the node of the managed entity for
which you want to add instrumentation.
2. Right-click the Management Instrumentation node, and then click Add New Measure.
3. Select the new measure in the Measures section of the model.
4. Select a value for the Instrumentation Level property of the measure; you can select
Coarse, Fine, or Debug. If you cannot see the Properties window, press F4 or right-click
the new measure, and then click Properties.
5. Modify the Name property for the measure.
6. Repeat steps 2 through 5 for any other abstract measures you need to model.

An abstract measure has two properties that you can set, as shown in the following table.
Abstract Measure property Description

Instrumentation Level This property specifies the level at which the entity will update the counter. The
options are Coarse (all operations, the default), Fine (diagnostic and debug
operations only), and Debug (debug operations only). For information about how
setting this affects the behavior of an application, see Appendix A.

Name This property contains the name of the abstract measure definition.

Defining Instrumentation Implementations


You can create the concrete implementations of the abstract events and measures you defined
in the model using the New Event Implementation Wizard or the New Measure Implementation
Wizard. To start the Wizard, right-click on an existing abstract event or measure in the
Management Model Explorer window, then click New Event Implementation Wizard or New
Measure Implementation Wizard.
Alternatively, you can create them manually as described in the following procedures.
To model concrete event instrumentation
1. If you need to model an Enterprise Library Log Entry event, right-click the abstract
event, and then click Add New Enterprise Library Log Entry.
2. In the Configurable Implementations section, select the Enterprise Library Log Entry
event you created.
3. Modify the properties of the event, specifying values for the Categories, Event ID,
Message, Name, Priority, Severity and Title.
4. Repeat steps 1 through 3 for any other Enterprise Library Log Entry events you need to
model.
5. If you need to model an Event Log event, right-click the abstract event, and then click
Add New Event Log Event.
6. Click Configurable Implementations, and then select the Event Log event you created.
7. Modify the properties of the Event Log event, specifying the Category, Event ID, Log,
Name, Severity and Source.
8. Specify a value for the Message Template property of the event. This is a template
containing placeholders (such as %1) for the values of the event parameters. You must
include the same number of placeholders as there are parameters for the abstract
event, and number them in increasing order starting from 1 with no duplicates and no
missing integers.
9. Repeat steps 5 through 8 for the other Event Log events you need to model.
10. If you need to model a Windows Eventing 6.0 event, right-click the abstract event, and
then click Add New Windows Eventing 6 Event.
11. In the Configurable Implementations section, select the Windows Eventing 6.0 Event
you created.
12. Modify the properties of the event, specifying values for the Channel, Level, Name,
Operation, Provider, Task, and Value.
13. Specify a value for the Message Template property of the event. This is a template
containing placeholders (such as %1) for the values of the event parameters. You must
include the same number of placeholders as there are parameters for the abstract
event, and number them in increasing order starting from 1 with no duplicates and no
missing integers.
14. Repeat steps 10 through 13 for any other Windows Eventing 6.0 events you need to model.
15. If you need to model a trace file entry, right-click the abstract event, and then click Add
New Trace File Entry.
16. Click Configurable Implementations, and then select the trace file entry you created.
17. Modify the properties of the trace file entry, specifying the Name property.
18. Specify a value for the Message Template property of the event. This is a template
containing placeholders (such as %1) for the values of the event parameters. You must
include the same number of placeholders as there are parameters for the abstract
event, and number them in increasing order starting from 1 with no duplicates and no
missing integers.
19. Repeat steps 15 through 18 for the other trace file entries you need to model.
20. If you need to model a WMI event, right-click the abstract event, and then click Add
New WMIEvent.
21. Click Configurable Implementations, and then select the WMI event you created.
22. Modify the properties of the WMI event, specifying the Name property.
23. Repeat steps 20 through 22 for the other WMI events you need to model.
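
The message-template rule that recurs in steps 8, 13, and 18 — placeholders such as %1 must start at 1 and run consecutively up to the abstract event's parameter count — can be sketched as a quick check (a hypothetical helper, not TSMMD code):

```python
import re

# Sketch of the message-template rule described above: placeholders such as
# %1 must start at 1 and run consecutively up to the parameter count.
# This is a hypothetical illustration, not the TSMMD's validation code.

def template_matches_parameters(template, parameter_count):
    placeholders = sorted(int(n) for n in re.findall(r"%(\d+)", template))
    return placeholders == list(range(1, parameter_count + 1))

print(template_matches_parameters("Order %1 failed: %2", 2))  # True
print(template_matches_parameters("Order %1 failed: %3", 2))  # False (gap)
```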

The following table shows the properties of an Enterprise Library Event implementation.

Enterprise Library Event property Description

Categories This property specifies a list of Categories that allows you to filter logging
events using a Category Filter in the Enterprise Library Logging Application
Block configuration. Separate each category name with a carriage return.

Event ID This property specifies the identifier for the event, and should be different from
any existing events.

Message This property specifies the text that Enterprise Library Logging Application
Block will include in the log message it generates.

Name This property contains the name of the Enterprise Library Logging Event
implementation. The name should start with a capital letter, and can contain
only alphanumeric characters (letters and numbers) and underscores.

Priority This property specifies the priority of the event using a positive or negative
numeric value. The priority allows you to filter logging events using a Priority
Filter in the Enterprise Library Logging Application Block configuration.

Severity This property specifies the severity of the event. You can select Critical, Error
(these are equivalent to Windows Event Log Error events), Information, Resume,
Start, Stop, Suspend, Transfer, Verbose (these are equivalent to Windows Event
Log Information events), or Warning (equivalent to Windows Event Log Warning
events).

Title This property specifies the text that Enterprise Library Logging Application
Block will use as the title of the log message it generates.

The following table shows the properties of an Event Log Event implementation.

Event Log Event property Description

Category This property contains a value list that allows you to filter individual events.

Event ID This property specifies the identifier for the event, and should be different from any existing
events.

Log This property specifies the target Windows Event Log name such as Application, or the
name of a custom Event Log.

Name This property contains the name of the Event Log Event implementation. The name should
start with a capital letter, and can contain only alphanumeric characters (letters and
numbers) and underscores.

Severity This property specifies the severity of the error, which sets the type of icon shown in
Windows Event Log and is useful for filtering events in a monitoring tool. The options
available are Error, Warning, Information, SuccessAudit, and FailureAudit.

Source This property contains the name to pass to the event system as the source of the error or
event.

Message Template This property is a template containing placeholders where the event system will
insert the values from event parameters when raising the event. If the abstract event
defines any parameters, you must include placeholders for the value of each
parameter. The placeholders must start with %1 and run consecutively to the
number of parameters defined for the event.
The following table shows the properties of a Windows Eventing 6.0 Event implementation.

Windows Eventing 6.0 Event property Description

Channel This property specifies the channel to use to deliver the event. The channels
you can use are Operational, TraceClassic, System, Application, Security,
Analytic, and Debug. Generally, you should use the three channels that target
the Event Log. These are Application, System, and Security.

Level This property specifies the severity or importance of the event. The values you
can select are Error, Critical, Warning, Informational, and Verbose. The
usual approach is to select Error for events that cause a transition to a Red
(failed) state, Warning for events that cause a transition to a Yellow (degraded)
state, and Information for events that cause a transition to a Green (working
normally) state.

Message Template This property is a template containing placeholders where the event system
will insert the values from event parameters when raising the event. If the
abstract event defines any parameters, you must include placeholders for the
value of each parameter. The placeholders must start with %1 and run
consecutively to the number of parameters defined for the event.

Name This property contains the name of the Windows Eventing 6.0 Event
implementation. The name should start with a capital letter, and can contain
only alphanumeric characters (letters and numbers) and underscores.

Operation This property indicates the type of low-level operation the application was
executing when the event occurred. The values you can choose are Info,
Start, Stop, DC_Start, DC_Stop, Extension, Reply, Resume, Suspend, and
Send.

Provider This property contains the value passed to the event system to indicate the
provider, and provides an indication to administrators and operators of the
source of the event. The default value is a combination of the name of the
model and the name of the current managed entity.

Task This property contains additional information that may be useful to
administrators and operators to indicate what the application was doing when
the event occurred; for example, "Create Order", "Import Data", or "Application
Starting".

Value This property is a unique identifier for the event, and should therefore be
different from any other events so that the monitoring system can filter on this
value.
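
The level-to-state convention noted in the Level row can be captured as a simple lookup; this is a sketch of the usual convention described above, not behavior the tool enforces:

```python
# Sketch of the usual Level-to-health-state convention described above
# (a convention, not something the tool enforces). Levels not shown here,
# such as Critical and Verbose, are omitted from the sketch.

LEVEL_TO_STATE = {
    "Error": "Red",            # failed
    "Warning": "Yellow",       # degraded
    "Informational": "Green",  # working normally
}

def state_for_level(level):
    return LEVEL_TO_STATE.get(level)  # None for unmapped levels

print(state_for_level("Error"))    # Red
print(state_for_level("Warning"))  # Yellow
```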

The following table shows the single property of a Trace File Entry implementation.

Trace File Entry property Description

Name This property contains the name of the Trace File Entry implementation.

The following table shows the properties of a WMI Event implementation.

WMI Event property Description

Name This property contains the name of the WMI Event implementation.

Namespace This property contains the WMI namespace within which the event will reside.

Query This property contains a query that identifies the event.

To model a performance counter


1. Right-click the abstract measure, and then click Add New Performance Counter.
2. Click Configurable Implementations, and then click the performance counter you
created.
3. Modify the properties of the performance counter, specifying the Counter Category
Name and Counter Object Name properties. The counter name must start with a
capital letter.
4. Select a value for the Counter Type property of the new performance counter that
indicates the way that it exposes the data, such as ElapsedTime or
RateOfCountsPerSecond32.
5. Modify the value of the Name property of the new performance counter.
6. Repeat steps 1 through 5 for all other performance counters you need.

The following table shows the properties of a Performance Counter implementation.

Performance Counter property Description

Counter Category Name This optional property contains the category name of the Windows Performance
Counter that supplies the values for this measure.

Counter Object Name This property contains the name of the Windows Performance Counter that
supplies the values for this measure. It must start with a capital letter.

Counter Type This property specifies the type of counter to use in terms of the way that it
aggregates or measures the target object, such as AverageBase or
ElapsedTime.

Name This property contains the name of the Performance Counter implementation.

Now that you have modeled concrete events and performance counters, you can map them to
the target environments and trust levels associated with the managed entity.
Discovering Existing Instrumentation in an Application
The Team System Management Model Designer Power Tool can discover instrumentation in
assemblies that are part of an existing application solution. The process will discover most
common instances of Windows Event Log Events, WMI Events, Enterprise Library Logging
entries, and Windows Performance Counters. The assemblies must reside in one or more
projects located in the Solution Items folder of the Management Model solution. The TSMMD
will compile the projects automatically when required.
To discover existing instrumentation:
1. Open the TSMMD solution that contains the application project(s) from which you want
to discover existing instrumentation. If you have not yet created a TSMMD solution
containing the application project(s), do the following:
a. Create a new TSMMD solution by following the steps in the topic "Creating a
New Management Model."
b. In Solution Explorer, right-click the Solution Items folder, point to Add, and then
click Existing Project.
c. Navigate to the existing project, and then click Open to add it to the TSMMD
solution.
d. Repeat steps b and c to add any other required projects.
2. Ensure that the TSMMD guidance package is enabled:
a. On the Tools menu, click Guidance Package Manager.
b. In the Guidance Package Manager dialog box, click the Enable/Disable
Packages button.
c. In the Enable and Disable Packages dialog box, select the TSMMD
Instrumentation and TSMMD Management Pack Generation check boxes.
d. In the Enable and Disable Packages dialog box, click OK, and then click Close in
the Guidance Package Manager dialog box.
3. Open an existing .tsmmd management model file in the designer, and then open
Management Model Explorer. If you cannot see Management Model Explorer, point to
Other Windows on the View menu, and then click Management Model Explorer.
4. Right-click the top-level node in Management Model Explorer, and then click Discover
Instrumentation.
5. The Discover Instrumentation Wizard opens, showing a list of all assemblies in all
projects with a check box next to each one. The check boxes for assemblies that will be
searched are already set. You can change the settings to add or remove individual
assemblies from the discovery process as required.
6. Select the type of instrumentation you want to discover in the Instrumentation Type
option list under the list of assemblies. You can select Event Log Event, WMI Event,
Performance Counter Measure, or Enterprise Library Logging, depending on whether
the assemblies you select contain instances of these types of instrumentation. Figure 2
shows the Discover Instrumentation Wizard.

Figure 2
The Discover Instrumentation Wizard

7. Click the Discover button. The Discovery Results window opens in Visual Studio showing
a list of all the discovered instrumentation. Figure 2 shows the Discovery Results
window after discovering Event Log Events instrumentation.

After you discover the instrumentation within one or more projects, you must map that
instrumentation to the appropriate managed entities in the management model. The following
procedure describes this process.
To map discovered instrumentation to a model:
1. Perform the steps of the previous procedure to generate a list of discovered
instrumentation using the TSMMD Discover Instrumentation recipe.
2. Locate the rows containing the instrumentation you want to import. You can filter the
list of instrumentation rows using the drop-down lists at the top of some of the columns
to help locate rows, and then click a column heading to sort the rows based on the
values in that column.
3. If you are not sure of the actual implementation of an instrumentation item, such as an
event or performance counter, right-click that item in the list of rows, and then click
one of the Go To options. For example, with Enterprise Library Logging, you can go to
the source code line that makes the call into the Logging Application Block or go to the
line that writes the logging entry.
4. Some of the instrumentation rows may contain one or more values that the discovery
process could not resolve. It marks these values as <Not Resolved>. Some of the
unresolved values may be optional (such as the instance name of some performance
counters), while others are mandatory. You must provide these values as part of the
mapping process.
5. Select the rows in the Discovery Results window that contain the discovered
instrumentation items you want to import into your management model. You can press
SHIFT+CTRL while clicking the list to select multiple items.
6. Now you can specify the mapping between the selected instrumentation items in the
Discovery Results window and the management model entities. To map one or more
instrumentation items to a specific managed entity, right-click the selected item rows,
click Quick Map, and then click the name of the entity. If none of the rows contains
unresolved mandatory items, you will see the managed entity name appear in the
Mapped To column.
7. If any row contains an unresolved mandatory item, you will see a dialog box that asks if
you want to resolve mandatory properties. Click Yes to display a dialog box where you
can provide values to override those in all the selected rows in the discovered
instrumentation list. For example, Figure 3 shows the Event Details dialog box, where
you specify the mandatory Source, Severity, and Log Name properties for an Enterprise
Library Logging event.
Figure 3
The Event Details dialog box for specifying unresolved mandatory instrumentation
properties

8. Alternatively, you can force the TSMMD to display the mapping details window;
perhaps because you want to change some values for the properties of the discovered
instrumentation or there are unresolved mandatory properties for which you know you
must provide values. In these cases, right-click the selected rows in the Discovery
Results window, click Map to open the mapping details window and enter the relevant
values, and then click OK.
9. The TSMMD adds the instrumentation to the Discovered Instrumentation section of
the selected management entity in the Management Model Explorer. Open the
Discovered Instrumentation section in Management Model Explorer to see the result,
to rename events or measures, and to make any remaining edits you require to the
properties.

The following tables describe the properties that you can set or edit for discovered
instrumentation. The Events section can contain definitions of Event Log Events and WMI
Events. For an existing or imported Event Log Event, the architect defines or edits the properties
shown in the following table.

Existing Event Log Event property Description

Description This property contains a description of the existing Event Log Event.

Event ID This property specifies the identifier for the event, and should be different from any
existing events.

IsDiscovered This Boolean property indicates if the Event Log Event was discovered by the
TSMMD or entered manually into the model.

Log This property specifies the target Windows Event Log name such as Application,
or the name of a custom Event Log.

Message This property contains the error message for this event.

Name This property contains the name of the existing Event Log Event.

Severity This property specifies the severity of the error, which sets the type of icon shown in
Windows Event Log and is useful for filtering events in a monitoring tool. The
options available are Error, Warning, Information, SuccessAudit, and
FailureAudit.

Source This property contains the name to pass to the event system as the source of the
error or event.

For an existing or imported WMI Event, the architect defines or edits the properties shown in
the following table.
Existing WMI Event property Description

Description This property contains a description of the existing WMI Event.

IsDiscovered This Boolean property indicates if the WMI Event was discovered by the TSMMD or
entered manually into the model.

Name This property contains the name of the existing WMI Event.

Namespace This property contains the WMI namespace within which the event will reside.

Query This property contains a query that identifies the event.

The Measures section can contain only definitions of Performance Counters. For an existing or
imported Performance Counter, the architect defines or edits the properties shown in the
following table.

Existing Performance Counter property Description

Counter Category Name This optional property contains the category name of the Windows
Performance Counter that supplies the values for this measure.

Counter Instance Name This property contains the instance name of the Windows Performance
Counter that supplies the values for this measure.

Counter Object Name This property contains the name of the Windows Performance Counter that
supplies the values for this measure. It must start with a capital letter.

Counter Object Type This property specifies the type of counter to use in terms of the way that it
aggregates or measures the target object, such as AverageBase or ElapsedTime.

Description This property contains a description of the existing Performance Counter.

IsDiscovered This Boolean property indicates if the Performance Counter was discovered by the
TSMMD or entered manually into the model.

Name This property contains the name of the existing Performance Counter.

Visible Name This property indicates the name of the counter as seen by the operating system.

Discovered instrumentation (either manually defined or automatically discovered by the
TSMMD) cannot be mapped to a target environment.

Creating Health Definitions


You can use health definitions to provide additional information about the model. The
information can be used to create an Operations Manager Management Pack.
Creating health definitions for each managed entity relies on creating aspects, each one of
which can have a health state. You can create aspects for a model using the New Aspect Wizard,
which simplifies the process of defining the aspect and health states, and specifying or creating
the abstract events or measure for each aspect. To start the Wizard, right-click on a managed
entity in the designer window or in the Management Model Explorer window, then click New
Aspect Wizard.
Alternatively, you can create new aspects directly in the Management Model Explorer window
by defining each aspect and health state individually, then selecting the abstract events or
measure that provides the information about state changes for the new aspect. The following
procedure explains how to add aspects manually to a TSMMD model.
To model health definitions for a managed entity
1. In Management Model Explorer, expand the managed entity node for which you want
to define aspects. If you cannot see Management Model Explorer, point to Other
Windows on the View menu, and then click Management Model Explorer.
2. Right-click the Health Definition node, and then click Add New Aspect.
3. In the Properties window, modify the value of the Name property for the aspect and
enter information for the Knowledgebase property that will assist operators and
administrators. If you cannot see the Properties window, press F4 or right-click the
Health Definition node, and then click Properties.
4. System Center Operations Manager categorizes aspects into four categories:
Availability, Configuration, Security, and Performance. In the Properties window, select
the appropriate category as the value of the Type property for the aspect.
5. In the Aspects section of Management Model Explorer, right-click the new aspect node,
and then click Add New Green Health State.
6. Expand the new aspect node to show the three health states, and then expand the
Green Health State node to show the Health Formula (which is currently empty).
7. If the indicators for state transitions for this aspect are events, right-click the Green
Health State node, and then click Add New Event Formula.
8. If the indicators for state transitions for this aspect are measures (performance
counters), right-click the Green Health State node, and then click Add New Measure
Formula.

You cannot mix events and measures in an aspect. All the states you define for an
aspect must be either events or measures (performance counters).

9. If you added a new Event Formula, use the Event property to specify the event that will
act as the indicator for this state transition. You can select an abstract event that you
previously defined in the Management Instrumentation section of this entity.
Alternatively, you can select an event discovered by the TSMMD or defined in the
Discovered Instrumentation section.
10. If you added a new Measure Formula, use the Measure Formula property to specify the
measure that will act as the indicator for this state transition. You can select an abstract
measure that you previously defined in the Management Instrumentation section of
this entity. Alternatively, you can select a performance counter discovered by the
TSMMD or defined in the Discovered Instrumentation section.
11. For a Measure Formula, you must also specify the conditions that trigger a state
transition. Select the Measure Formula node and specify values for the Upper Bound
and Lower Bound properties.
12. Repeat steps 5 through 11 to specify the yellow and red states for the aspect. In
addition to the mandatory green health state, you can specify either or both of the
yellow and red health states for a managed entity.
13. Repeat steps 2 through 12 to add any other aspects you require to the managed entity.
14. Repeat the complete procedure to add aspects to all other managed entities in the
model.
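
The measure-formula evaluation described in steps 8 through 11 can be sketched as follows. The function, state names, counter, and bounds are all hypothetical illustrations of how Upper Bound and Lower Bound delimit each health state:

```python
# Sketch of how measure-formula bounds determine an aspect's health state,
# as described in steps 8-11 above. The function, counter, and bound values
# are hypothetical illustrations, not TSMMD behavior.

def evaluate_measure(value, states):
    """states maps a state name ('Green', 'Yellow', 'Red') to the
    (lower_bound, upper_bound) pair of its measure formula."""
    for state, (lower, upper) in states.items():
        if lower <= value <= upper:
            return state
    return None  # no formula matched the current counter value

# Hypothetical bounds for a queue-length performance counter.
queue_length_states = {
    "Green": (0, 50),
    "Yellow": (51, 200),
    "Red": (201, float("inf")),
}

print(evaluate_measure(42, queue_length_states))   # Green
print(evaluate_measure(120, queue_length_states))  # Yellow
```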

Validating the Management Model


Creating a management model using the TSMMD tool can be a fairly lengthy and complex
process. Before the model can be used by others, such as the development team for the
application, you must ensure that it is complete. The TSMMD tool can perform a number of
checks on the model to ensure that it is internally consistent.
To validate the management model
1. To validate the complete model, right-click the model designer surface or any of the
nodes in the model in Management Model Explorer, and then click Validate All. If you
cannot see Management Model Explorer, point to Other Windows on the View menu,
and then click Management Model Explorer.
2. To validate a section of the model (useful as you define sections of instrumentation or
individual health aspects), right-click the parent node of the section you want to
validate, and then click Validate.
3. The TSMMD validates the model, or the selected node and its child nodes, and reports
the result in the Visual Studio Output window. If there are validation errors and/or
warnings, they appear in the Visual Studio Error List window.
In addition to validating the management model, after you create the application you can
also verify that the application code calls the instrumentation code. For more information,
see Chapter 8 of this guide.

Management Model Guidelines


When creating management models using the TSMMD tool, you should consider the following
guidelines:
• All managed entities and external managed entities must have unique names. An
external managed entity cannot have the same name as a managed entity.

• Entry points to the model from other managed entities not represented in the model
should be shown as unmanaged entities.
• If multiple models are used to represent a system, each managed entity should only be
represented in one model; this managed entity can be represented as an external
managed entity in other models.
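
The first guideline can be sketched as a quick uniqueness check across a model's entities (a hypothetical helper, not TSMMD validation code):

```python
# Sketch of the naming guideline above: managed entities and external managed
# entities must all have unique names, and an external managed entity cannot
# reuse a managed entity's name. Hypothetical helper, not TSMMD code.

def find_duplicate_names(managed, external):
    seen, duplicates = set(), set()
    for name in list(managed) + list(external):
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return sorted(duplicates)

print(find_duplicate_names(["Shipping", "Transport"], ["Billing"]))   # []
print(find_duplicate_names(["Shipping", "Transport"], ["Shipping"]))  # ['Shipping']
```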

Northern Electronics Scenario


The solutions architect is now in a position to define the management model for the application.
The architect decides that the application will consist of two solutions and will split the model
across those two solutions, as shown in Figure 1.
Figure 1
Solutions used in the Northern Electronics example
The solutions architect then creates management models for each solution, as shown in Figures
2 and 3.

Figure 2
The Transport Consolidation Solution
Figure 3
The Shipping Solution

Summary
This chapter described how to use the TSMMD tool to create a management model, and it
provided guidelines for effective use of the TSMMD tool. It also showed how the TSMMD tool
was used to model the solutions in the Northern Electronics Scenario.
Section 3
Developing for Operations
This section focuses on the developer tasks necessary for creating well-instrumented
manageable applications. It describes how to create reusable instrumentation helpers from the
model defined in the Team System Management Model Designer Power Tool (TSMMD) and
discusses the instrumentation artifacts that are generated. It examines the developer tasks that
are necessary to create and manage event log, Windows Management Instrumentation (WMI),
Eventing 6.0, and performance counter instrumentation. The section also includes a chapter
about building install packages for instrumentation; however, this chapter is not complete in the
preliminary version of this guide.
This section should be of use primarily to application and instrumentation developers.
Chapter 8, "Creating Reusable Instrumentation Helpers"
Chapter 9, "Event Log Instrumentation"
Chapter 10, "WMI Instrumentation"
Chapter 11, "Windows Eventing 6.0 Instrumentation"
Chapter 12, "Performance Counters Instrumentation"
Chapter 13, "Building Install Packages"
Chapter 8
Creating Reusable Instrumentation
Helpers
After the architect defines the management model for the application, it is up to the developer
to write instrumentation code that reflects the management model. It is recommended that you
isolate instrumentation in an instrumentation helper. This chapter describes how to use the
guidance automation supplied with the Team System Management Model Designer Power Tool
(TSMMD) to automatically create the instrumentation helper, and it includes details about the
artifacts that are created. It then discusses how to consume the instrumentation from an
application.

The guidance automation included with the TSMMD tool simplifies the process of creating
instrumentation helper artifacts. However, you can use the information contained in this
chapter to manually create your own instrumentation helper classes.

Creating Instrumentation Helper Classes


After you determine that the model has no errors (as shown in Chapter 7 of this guide), you can
generate the instrumentation helper classes.
To generate instrumentation helper classes
1. If you have previously generated the instrumentation code from your model, you
should delete it before you regenerate the code. In Visual Studio Solution Explorer,
select and delete the Instrumentation subfolder and all its contents.
2. In Visual Studio, make sure that the TSMMD guidance package is enabled:
a. On the Tools menu, click Guidance Package Manager.
b. In the Guidance Package Manager dialog box, click the Enable/Disable Packages
button.
c. In the Enable and Disable Packages dialog box, select the TSMMD
Instrumentation and TSMMD Management Pack Generation check boxes.
d. In the Enable and Disable Packages dialog box, click OK, and then click Close in
the Guidance Package Manager dialog box.
3. In Management Model Explorer, right-click the top-level entry, and then click
Generate Instrumentation Helper. Alternatively, right-click anywhere on the model
designer surface, and then click Generate Instrumentation Helper.
4. The guidance recipe first validates the entire model and then (providing there are no
errors) automatically generates the instrumentation projects and artifacts in the
Instrumentation solution folder. Finally, it opens the file
InstrumentationConfiguration.config in the editor window so that you can specify
the run-time target environments and instrumentation granularities for each
managed entity in the application.

Figure 1
Instrumentation Helper code generated by the TSMMD guidance automation

Instrumentation Solution Folder


A fundamental principle behind the design of manageable applications is to abstract
instrumentation, meaning that applications call abstract events and measures, which are
mapped to concrete implementations of these events and measures. This abstraction is
reflected in the code, with a separate solution folder named Instrumentation. This folder
captures the instrumentation defined in the management model, and after the artifacts in this
folder have been created, it should not be necessary to modify them. This allows you to
separate application design from the instrumentation, and in some cases have a separate
instrumentation developer responsible for creating this solution artifact.
The Instrumentation solution folder contains instrumentation projects and a lib folder, which
contains the file Microsoft.Practices.DFO.Guidance.Configuration.dll.
Three types of instrumentation projects are created as artifacts in the Instrumentation solution
folder:
• API projects
• Implementation projects
• Technology projects

The next sections describe each of these types of projects in more detail.

API Projects
One API project is created for each managed entity. Each of these projects contains an abstract
class. The abstract class is a helper class that defines the following:

• It defines one protected constructor that receives a ManagedEntityHealthElement as a
parameter.
• It defines a static GetInstance method that returns an instance of this API class.
• It defines one public CanRaise method (named "CanRaise" + <Event Name>) for each
abstract event defined in the managed entity. This method returns true if the event can
be raised according to the instrumentation configuration; otherwise, it returns false.
• It defines one public Raise method (named "Raise" + <Event Name>) for each abstract
event defined in the managed entity. This method calls the concrete instrumentation if
the configuration allows the event to be raised.
• It defines one protected abstract DoRaise method (named "DoRaise" + <Event Name>)
for each abstract event defined in the managed entity. This method is overridden by
each concrete instrumentation class.
• It defines one public CanIncrement method (named "CanIncrement" + <Measure
Name>) for each abstract measure defined in the managed entity. This method returns
true if the measure can be incremented according to the instrumentation configuration;
otherwise, it returns false.
• It defines one public Increment method (named "Increment" + <Measure Name>) and
one public IncrementBy method (named "IncrementBy" + <Measure Name>) for
each abstract measure defined in the managed entity. These methods call the concrete
instrumentation if the configuration allows the measure to be incremented. The
difference between the two is that IncrementBy receives an Int parameter that
specifies the amount by which to increment the measure.
• It defines one protected abstract DoIncrement method (named "DoIncrement" +
<Measure Name>) and one protected abstract DoIncrementBy method (named
"DoIncrementBy" + <Measure Name>) for each abstract measure defined in the
managed entity. These methods are overridden by each concrete instrumentation class.
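Taken together, these conventions can be sketched as follows. This is an illustrative outline only, not the exact TSMMD output: the entity name (SampleEntity), the event and measure names, and the boolean flag that stands in for the configuration lookup are all hypothetical.

```csharp
using System;

// Hypothetical sketch of a generated API helper class for a managed
// entity named "SampleEntity" with one abstract event (OrderReceived)
// and one abstract measure (OrdersProcessed). The real generated class
// receives a ManagedEntityHealthElement in its constructor and reads
// the configuration file; a simple flag stands in for that here.
public abstract class SampleEntityApi
{
    private readonly bool instrumentationEnabled;

    // Protected constructor (stand-in for the one that receives a
    // ManagedEntityHealthElement parameter).
    protected SampleEntityApi(bool instrumentationEnabled)
    {
        this.instrumentationEnabled = instrumentationEnabled;
    }

    // CanRaise<Event Name>: true if configuration allows the event.
    public bool CanRaiseOrderReceived()
    {
        return this.instrumentationEnabled;
    }

    // Raise<Event Name>: calls the concrete instrumentation only if
    // the event can be raised according to the configuration.
    public void RaiseOrderReceived(string orderId)
    {
        if (this.CanRaiseOrderReceived())
        {
            this.DoRaiseOrderReceived(orderId);
        }
    }

    // DoRaise<Event Name>: overridden by each concrete class.
    protected abstract void DoRaiseOrderReceived(string orderId);

    // The measure methods follow the same guard-then-delegate pattern.
    public bool CanIncrementOrdersProcessed()
    {
        return this.instrumentationEnabled;
    }

    public void IncrementOrdersProcessed()
    {
        if (this.CanIncrementOrdersProcessed())
        {
            this.DoIncrementOrdersProcessed();
        }
    }

    public void IncrementByOrdersProcessed(int quantity)
    {
        if (this.CanIncrementOrdersProcessed())
        {
            this.DoIncrementByOrdersProcessed(quantity);
        }
    }

    protected abstract void DoIncrementOrdersProcessed();
    protected abstract void DoIncrementByOrdersProcessed(int quantity);
}
```

A concrete implementation class only has to override the DoRaise and DoIncrement methods; all the configuration-checking logic stays in this base class.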

The guidance automation in the TSMMD names the API projects ManagedEntityNameAPI.

Implementation Projects
One implementation project is created for each managed entity’s trust level. Each of these
projects contains one class as the concrete implementation of the API class previously
explained. This concrete helper class extends the API class and defines the following:

• It defines one public constructor calling the base constructor.


• It defines the implementations of all the abstract methods defined on the base class.
The implementation of each method depends on the type of event or measure chosen
as the concrete implementation of the event or measure associated with the trust level.
As an example, suppose you have a managed entity with one abstract event defined,
and you then add two concrete implementations for this event: a WMIEvent and an
EventLogEvent. After this, you define two trust levels named Medium Trust and High
Trust, associate the concrete WMIEvent with the Medium Trust level, and associate the
concrete EventLogEvent with the High Trust level. In this case, there is one project with
one API class containing the corresponding methods for the abstract event, and two
projects with one class each, containing the different concrete implementations of the
abstract event defined on the API class.
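The example above can be sketched in code as follows. The class and event names are hypothetical, and the method bodies only record which technology would fire; the real generated classes call the concrete WMI and event log instrumentation.

```csharp
using System;

// Minimal stand-in for the generated API class (illustrative only).
public abstract class OrderServiceApi
{
    public void RaiseOrderFailed(string error)
    {
        this.DoRaiseOrderFailed(error);
    }

    protected abstract void DoRaiseOrderFailed(string error);
}

// Would be generated as OrderService.MediumTrust.Impl: the abstract
// event is mapped to the concrete WMIEvent.
public class OrderServiceMediumTrustImpl : OrderServiceApi
{
    public string LastAction = "";

    protected override void DoRaiseOrderFailed(string error)
    {
        // The real generated class would fire the WMI event here.
        this.LastAction = "WMIEvent: " + error;
    }
}

// Would be generated as OrderService.HighTrust.Impl: the abstract
// event is mapped to the concrete EventLogEvent.
public class OrderServiceHighTrustImpl : OrderServiceApi
{
    public string LastAction = "";

    protected override void DoRaiseOrderFailed(string error)
    {
        // The real generated class would write the event log entry here.
        this.LastAction = "EventLogEvent: " + error;
    }
}
```

The application sees only the abstract RaiseOrderFailed method; which concrete class is instantiated at run time depends on the configured target environment.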

The guidance automation in the TSMMD names the implementation project


ManagedEntityName.TargetEnvironmentName.Impl.

Technology Projects
One technology project is created for each technology used. Exactly what each technology
project contains depends on the technology. This section describes the three technologies
currently represented in the TSMMD tool: event logs, Windows Management Instrumentation
(WMI) events, and performance counters.

For more information about how the event logs, WMI events, and performance counters are
used, see Chapters 9, 10, and 11 of this guide.

There is no technology project for Enterprise Library Logging events. The TSMMD generates
the code required to create logging entries within the API helper classes.

Event Log Project


An event log project contains the following:
• It contains one *.mc file for each source defined on eventLogEvents across all entities.
Each of these files contains one entry for each eventLogEvent defined for that source.
• It contains one EventMessages.cmd file.
• It contains one EventLogEventsInstaller class.

The guidance automation provided with this guide names the event log project
EventLogEventsInstaller.

Windows Eventing 6.0 Project


A Windows Eventing 6.0 project contains the following:

• It contains one EventingResourceComplier.cmd file.


• It contains one EventsDeclaration.man XML manifest file.

The guidance automation provided with this guide names the Windows Eventing 6.0 project
WindowsEventing6EventsInstaller.

The TSMMD can create a Windows Eventing 6.0 View file that administrators can use to create
a custom view in Windows Event Log in Windows Vista and Windows Server 2008 to view
events generated by a TSMMD-based application.

WMI Project
A WMI project contains the following:

• It contains one class for each WMI event defined across managed entities.
• It contains one WmiEventsInstaller class.

The guidance automation provided with this guide names the WMI project
WmiEventsInstaller.

Performance Counter Project


A performance counter project contains the following:

• It contains one PerformanceCountersInstaller class.

The guidance automation provided with this guide names the performance counter project
PerformanceCountersInstaller.

Using the Instrumentation Helpers


After the helper classes are created, you can use the instrumentation code by calling the
instrumentation methods of the generated API classes. At run time, the application's
configuration determines which implementation of the instrumentation is used. You do not
need to be aware of the application's configuration during development; the logic that applies
the configuration is in the generated API helper classes.
Your application code should call only abstract events and measures; the instrumentation
helper code ensures that the corresponding events and performance counters are used, as
defined in the instrumentation model.
Abstract events have three methods in their corresponding API class:

• DoRaise<eventName>(<eventParameters>). This is an abstract method that should be


implemented by subclasses. The implementation depends on the type of event (event
log, WMI, or trace file entry).
• CanRaise<eventName>(). This method returns true or false, depending on settings in
the configuration file. For example, if an event is defined as fine, and the
instrumentation level in the configuration file is set to coarse, this method returns false.
• Raise<eventName>(<eventParameters>). If configuration settings allow the event to be
raised, this method raises the event by calling the concrete implementation.

Abstract measures have five methods in their corresponding API class:

• DoIncrement<measureName>(). This is an abstract method that should be


implemented by subclasses.
• CanIncrement<measureName>(). This method returns true or false, according to
settings in the configuration file.
• Increment<measureName>(). If configuration settings allow the measure to be
incremented, this method increments the measure by calling the concrete
implementation.
• DoIncrementBy<measureName>(<incrementQuantity>). This is an abstract method that
should be implemented by subclasses.
• IncrementBy<measureName>(<incrementQuantity>). If configuration settings allow the
measure to be incremented, this method increments the measure by the quantity
defined in the <incrementQuantity> parameter.
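The following sketch shows how application code might consume these methods. The entity and event names (ShippingService, PickupServiceSOAPError), the stub implementation, and the always-true configuration check are all illustrative; in a real application, GetInstance returns the concrete class selected by the run-time configuration.

```csharp
using System;

// Illustrative stand-in for a generated API class and one concrete
// implementation; names are hypothetical.
public abstract class ShippingServiceApi
{
    private static readonly ShippingServiceApi instance = new ConsoleImpl();

    // In the generated code, GetInstance returns the concrete class
    // for the configured target environment.
    public static ShippingServiceApi GetInstance()
    {
        return instance;
    }

    public bool CanRaisePickupServiceSOAPError()
    {
        return true; // the real method consults the configuration file
    }

    public void RaisePickupServiceSOAPError(string message)
    {
        if (this.CanRaisePickupServiceSOAPError())
        {
            this.DoRaisePickupServiceSOAPError(message);
        }
    }

    protected abstract void DoRaisePickupServiceSOAPError(string message);

    private class ConsoleImpl : ShippingServiceApi
    {
        protected override void DoRaisePickupServiceSOAPError(string message)
        {
            // A real implementation would use the configured technology.
            Console.WriteLine("PickupServiceSOAPError: " + message);
        }
    }
}

// Application code: the caller knows only the abstract event.
public static class PickupClient
{
    public static void CallPickupService()
    {
        try
        {
            // Stands in for the real Web service call.
            throw new InvalidOperationException("SOAP endpoint unreachable");
        }
        catch (Exception ex)
        {
            ShippingServiceApi.GetInstance()
                .RaisePickupServiceSOAPError(ex.Message);
            throw;
        }
    }
}
```

Note that the application code never mentions event logs, WMI, or performance counters; it raises the abstract event and lets the helper route it.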

Verifying That Instrumentation Code Is Called from the Application


After you call the instrumentation code from your application, you can perform a validation
check in Visual Studio to check that the instrumentation methods of the generated API helper
classes are called from the application code.
To validate instrumentation
1. In Visual Studio, click the Solution Explorer tab.
2. Right-click the model file, and then click Verify Instrumentation Coverage.

The results of the validation check appear in the Output window. Figure 2 shows a case where
helper methods are not called from the application.
Figure 2
Error list generated when Verify Instrumentation Coverage runs

You can use the validation check to provide a checklist of tasks when instrumenting your
application. The TSMMD can verify coverage for applications written in Visual Basic and C#. If
you create your application using any other language, the TSMMD will not be able to locate
calls to the instrumentation, and will report an error.

An additional limitation in this release is that the TSMMD cannot discover instrumentation
calls made from an ASP.NET Web application written in Visual Basic.

Summary
This chapter described how to generate instrumentation helper classes for an application and
how to call the instrumentation code from the application. By starting with a management model
defined in TSMMD, you can automatically create the instrumentation code you require, and
then call the abstract events from your application code. The instrumentation helpers ensure
that the correct instrumentation technologies are used.
Chapter 9
Event Log Instrumentation
In Windows, an event is defined as any significant occurrence—whether in the operating system
or in an application—that requires users to be notified. Critical events are sent to the user in the
form of an immediate message on the screen. Other event notifications are written to one of
several event logs that record the information for future reference.
Event logging in Microsoft Windows provides a standard, centralized way for you to have your
applications record important software and hardware events. Operations staff can access events
written to the event logs using the Event Viewer and use them to diagnose application
problems.

This chapter focuses on the eventing mechanism used in versions of Windows earlier than
Windows Vista. Windows Vista uses a different eventing mechanism, Eventing 6.0, as
will future versions of Windows. For information about Eventing 6.0, see Chapter 11 of this
guide.

By default, there are three event logs available:


• System log. This tracks events that occur on system components—for example, a
problem with a driver.
• Security log. This tracks security changes and possible breaches.
• Application log. This tracks events that occur in an application.

In addition to these logs, other programs, such as Active Directory, may create their own default
logs. You can also create your own custom logs for use with your own applications.
This chapter demonstrates how developers can create event log events in code and ensure that
they are written to the appropriate event log. Where appropriate, code examples reflect the
code used in the Northern Electronics Transport Consolidation Solution.

Not all of the event log instrumentation code described in this chapter is implemented in the
instrumentation helpers generated by the TSMMD tool. For example, no code is generated to
clear existing event logs or to delete event logs. However, it is still included in this chapter
because it may be required.
Installing Event Log Functionality
Before you can write event log entries, you must specify settings for the event log in the
Windows registry. These changes require administrative rights over the local computer, so they
should usually be performed when the application is installed instead of at run time. This section
describes how to use the EventLogInstaller class to install event log functionality for your
application.

Event Sources
One of the primary responsibilities of the EventLogInstaller class is to create an event source for
the application. Event sources are used to uniquely identify a source of events in the event log.
They are defined in the registry under
HKLM\System\CurrentControlSet\Services\EventLog\EventLogName.
Typically, an event source will be named after the application or managed entity that the event
arose from. Figure 1 shows an event in Event Viewer, with the source value for the event
highlighted.
Figure 1
Event log entry with the event source highlighted
By default, an event source for an application is defined in the Windows Application log.
However, it is possible to specify different logs, including custom event logs. For more
information, see "Using Custom Event Logs" later in this chapter.

The EventLogInstaller class can install event logs only on the local computer.

It is common for the source to be the name of the application or another identifying string. Any
attempt to create a duplicate Source value will result in an exception. However, a single event
log can be associated with multiple sources.
Using the EventLogInstaller class
To install an event log, you should create a project installer class that inherits from Installer and
set the RunInstallerAttribute for the class to true. Within your project, create an
EventLogInstaller instance for each event source and add the instance to your project installer
class.

When the install utility is called, it looks at the RunInstallerAttribute. If this attribute is set to
true, the utility installs all the items in the Installers collection associated with your project
installer. If RunInstallerAttribute is false, the utility ignores the project installer.

You can modify the properties of an EventLogInstaller instance either before or after adding the
instance to the Installers collection of your project installer. You must set the Source property if
your application will write to the event log.
If the specified source already exists when you set the Source property, EventLogInstaller
deletes the previous source and recreates it, assigning the source to the log you specify in the
Log property.
Typically, you would set the following additional properties:

• Log. This property is the event log that events will be written to. If it is not set, the
event source is registered to the Application log.
• UninstallAction. This property gets or sets a value that indicates whether the installer
tool (Installutil.exe) should remove the event log or leave it in its installed state at
uninstall time.
• CategoryResourceFile. This property identifies a category resource file, which is used to
write events with localized category strings. It should only be used if you are creating
events with categories.
• CategoryCount. This property gets or sets the number of categories in the category
resource file. It should only be used if you are creating events with categories.
• ParameterResourceFile. This property gets or sets the path of the resource file that
contains message parameter strings for the source. It is used when you want to
configure an event log source to write localized event messages with inserted
parameter strings.
• MessageResourceFile. This gets or sets the path of the resource file that contains
message formatting strings for the source. It is used when you want to configure an
event log source to write localized event messages.

The last four properties in the preceding list provide considerable flexibility in creating events
that are useful for manageability purposes. By using message resource files, categories, and
inserted parameter strings, you can create messages with more useful information, and manageability
applications can perform automated processes based on particular parameters. For more
information about how these properties are used, see "Writing Events to an Event Log" later in
this chapter.
Typically, you should not call the methods of the EventLogInstaller class from within your code;
they are generally called only by the InstallUtil.exe installation utility. The utility automatically
calls the Install method during the installation process. It backs out failures, if necessary, by
calling the Rollback method for the object that generated the exception.
The following code example shows EventLogInstaller.
C#
using System;
using System.Management.Instrumentation;
using System.ComponentModel;
using System.Diagnostics;
using System.Configuration.Install;
using System.IO;
using System.Text;

namespace EventLogEvents.InstrumentationTechnology
{
    [RunInstaller(true)]
    public class EventLogEventsInstaller : Installer
    {
        // constructor
        public EventLogEventsInstaller()
        {
            // Installer for events with source name: PS
            EventLogInstaller myEventLogInstallerPS = new EventLogInstaller();
            string resourceFilePS = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesPS.dll");
            myEventLogInstallerPS.Source = "PS";
            myEventLogInstallerPS.Log = "Application";
            myEventLogInstallerPS.CategoryCount = 0;
            myEventLogInstallerPS.CategoryResourceFile = resourceFilePS;
            myEventLogInstallerPS.MessageResourceFile = resourceFilePS;
            Installers.Add(myEventLogInstallerPS);

            // Installer for events with source name: SS
            EventLogInstaller myEventLogInstallerSS = new EventLogInstaller();
            string resourceFileSS = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesSS.dll");
            myEventLogInstallerSS.Source = "SS";
            myEventLogInstallerSS.Log = "Application";
            myEventLogInstallerSS.CategoryCount = 0;
            myEventLogInstallerSS.CategoryResourceFile = resourceFileSS;
            myEventLogInstallerSS.MessageResourceFile = resourceFileSS;
            Installers.Add(myEventLogInstallerSS);

            // Installer for events with source name: TS
            EventLogInstaller myEventLogInstallerTS = new EventLogInstaller();
            string resourceFileTS = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesTS.dll");
            myEventLogInstallerTS.Source = "TS";
            myEventLogInstallerTS.Log = "Application";
            myEventLogInstallerTS.CategoryCount = 0;
            myEventLogInstallerTS.CategoryResourceFile = resourceFileTS;
            myEventLogInstallerTS.MessageResourceFile = resourceFileTS;
            Installers.Add(myEventLogInstallerTS);

            // Installer for events with source name: WSTransport
            EventLogInstaller myEventLogInstallerWSTransport = new EventLogInstaller();
            string resourceFileWSTransport =
                Path.Combine(Environment.CurrentDirectory, "EventMessagesWSTransport.dll");
            myEventLogInstallerWSTransport.Source = "WSTransport";
            myEventLogInstallerWSTransport.Log = "Application";
            myEventLogInstallerWSTransport.CategoryCount = 0;
            myEventLogInstallerWSTransport.CategoryResourceFile = resourceFileWSTransport;
            myEventLogInstallerWSTransport.MessageResourceFile = resourceFileWSTransport;
            Installers.Add(myEventLogInstallerWSTransport);
        }
    }
}

Writing Events to an Event Log


After the event log functionality is installed, you can write events to the event log. You have two
choices in writing events to an event log:
• WriteEntry method
• WriteEvent method

The next sections describe each of these methods.

Using the WriteEntry Method


After the EventLog component is appropriately configured, you can use the WriteEntry
overloaded method to write the event to the appropriate event log.
The following code shows one of the overloads used.
C#
byte[] myByte = new byte[10];
for (int i = 0; i < 10; i++)
{
    myByte[i] = (byte)(i % 2);
}
// Write an error entry to the event log from the second source.
Console.WriteLine("Write from second source ");
EventLog.WriteEntry("SecondSource", "Writing error to event log.",
    EventLogEntryType.Error, myEventID, myCategory, myByte);

The WriteEvent Method


The WriteEvent method is a more flexible alternative to the WriteEntry method. This method is
used in the instrumentation helpers generated by the TSMMD tool.
The WriteEvent method can be used to write a localized entry with additional event-specific
data to the event log, using a source already registered as an event source for the appropriate
log. You specify the event properties with resource identifiers rather than string values.
The Event Viewer uses the resource identifiers to display the corresponding strings from the
localized resource file for the source. You must register the source with the corresponding
resource file before you write events using resource identifiers.
The instance input specifies the event message and properties. You should set the InstanceId of
the instance input for the defined message in the source message resource file. Optionally, you
can set the CategoryId and EntryType of the instance input to define the category and event
type of your event entry. You can also specify an array of language-independent strings to insert
into the localized message text.
The following code shows the WriteEvent method.
C#
protected override void DoRaisePickupServiceSOAPError(string errorMessage)
{
    string source = "SS";
    string logName = "Application";
    string machineName = ".";
    long eventId = 2003;
    int categoryId = 0;
    Object[] values = new Object[1];
    values[0] = errorMessage;
    EventLogEntryType entryType = EventLogEntryType.Error;

    EventLog eventLog = new EventLog();
    eventLog.Source = source;
    eventLog.Log = logName;
    eventLog.MachineName = machineName;

    EventInstance eventInstance = new EventInstance(eventId, categoryId);
    eventInstance.EntryType = entryType;

    eventLog.WriteEvent(eventInstance, values);
}
Set values to a null reference if the event message does not contain formatting placeholders
for replacement strings.

You can specify binary data with an event when it is necessary to provide additional details for
the event. For example, use the data parameter to include information about a specific error.
The Event Viewer does not interpret the associated event data; it displays the data in a
combined hexadecimal and text format. You should use event-specific data sparingly; include it
only if you are sure it will be useful. You can also use event-specific data to store information the
application can process independently of the Event Viewer.
The specified source must be registered for an event log before using WriteEvent. The specified
source must be configured for writing localized entries to the log; the source must at minimum
have a message resource file defined.

If your application writes entries using both resource identifiers and string values, you must
register two separate sources. For example, configure one source with resource files, and then
use that source in the WriteEvent method to write entries using resource identifiers to the
event log. Then create a different source without resource files, and use that source in the
WriteEntry method to write strings directly to the event log using that source.

Reading Events from Event Logs


It is not necessary to use the EventLogInstaller class to read events from event logs. Instead, you
should perform the following high level tasks:
1. Create and configure an instance of the EventLog class.
2. Use the Entries collection to read the entries in the log.

Reading events from event logs is not included in the functionality of the instrumentation
helper classes automatically generated by the TSMMD tool.

You should treat the data from an event log as you would any other input coming from outside
your system. Your application may need to validate the data in the event log before using it as
input. Another process, possibly a malicious one, may have accessed the event log and added
entries.

Creating and Configuring an Instance of the EventLog Class


An instance of the EventLog class is defined in the following code.
C#
EventLog eventLog = new EventLog();
There are three major properties involved in configuring an instance of the EventLog class:

• Log. This property indicates the log with which you want to interact.
• MachineName. This property indicates the computer on which the log resides.
• Source. This property indicates the source string that will be used to identify your
component when it writes entries to a log. In this case, you are reading from a log, so
you do not need to specify this property.

To read from an event log, you must specify the Log and MachineName properties, so that the
component is aware of which log to read from. The following code shows the Log and
MachineName properties specified.
C#
eventLog.Log = logName;
eventLog.MachineName = machineName;

Visual Basic
Dim log, machine As String
...
Dim EventLog1 As New EventLog
EventLog1.Log = log
EventLog1.MachineName = machine

Using the Entries Collection to Read the Entries


You use the Entries collection to look at the entries in a particular event log. You can use
standard collection properties such as Count and Item to work with the elements the collection
contains. You might read event log entries to learn more information about a problem that
occurred in your system, to identify usage patterns, or to identify problems (such as a failing
hard drive) before they cause damage.

The Entries collection is read-only, so it cannot be used to write to the event log.

The following example shows how to retrieve all of the entries from a log.
C#
foreach (System.Diagnostics.EventLogEntry entry in EventLog1.Entries)
{
Console.WriteLine(entry.Message);
}

If you ask for the count of entries in a new custom log that has not yet been written to, the
system returns the count of the entries in the Application log on that server. To avoid this
problem, make sure that logs you are counting have been created and written to.
Clearing Event Logs
Event logs are set to a maximum size that determines how many entries each log can contain.
When an event log is full, it either stops recording entries or begins overwriting the oldest
entries with new entries, depending on the settings specified in the Windows Event Viewer. In
either case, you can clear the log of its existing entries to free the log and allow it to start
recording events again. You must have Administrator rights to the computer on which the log
resides in order to clear entries.

Clearing event logs is not included in the functionality of the instrumentation helper
automatically generated by the TSMMD tool.

By default, the Application log, System log, and Security log are set to a default maximum size of
4992 K. Custom logs are set to a default maximum of 512 K.

You can also use the Windows Event Viewer to free up space on a log that has become full.
You can set the log to overwrite existing events, you can write log entries to an external file, or
you can increase the maximum size of the log. However, you cannot remove only some of the
entries in a log; when you clear a log, you remove all of its contents. For more information, see
"How to: Launch Event Viewer" on MSDN or your Event Viewer documentation.

You use the Clear method to clear the contents of an event log. The following code is used to
clear the events from EventLog1.
C#
EventLog1.Clear();

Deleting Event Logs


You can delete any event log on your local computer or a remote server if you have the
appropriate registry rights. When you delete a log, the system first deletes the file that contains
the log's contents and then accesses the registry and removes the registration for all of the
event sources that were registered for that log. Even if you re-create the log at a later point, this
process will not create the sources by default, so some applications that previously were able to
write entries to that log may not be able to write to the new log.

Deleting event logs is not included in the functionality of the instrumentation helper
automatically generated by the TSMMD tool.
To delete an event log, you should use the Delete method and specify the name of the log you
want to delete. The Delete method is static, so you do not need to create an instance of the
EventLog component before you call the method—instead, you can call the method on the
EventLog class itself, as shown in the following code.
C#
System.Diagnostics.EventLog.Delete ("MyCustomLog");

Re-creating an event log can be a difficult process. It is good practice to not delete any of the
system-created event logs, such as the Application log. You can delete your custom logs and
re-create them as needed.

The following code shows an example of verifying a source and deleting a log if the source
exists. This code assumes that an Imports or Using statement exists for the System.Diagnostics
namespace.
C#
if (System.Diagnostics.EventLog.Exists("MyCustomLog"))
{
System.Diagnostics.EventLog.Delete("MyCustomLog");
}

Removing Event Sources


You can remove your source if you no longer need to use it to write entries to that log. Doing
this affects all components that used that source to write to the log. For example, if you have
two Web services that write to a log using the source name "mysource," removing "mysource"
as a valid source of events affects both Web services.

Removing event sources is not included in the functionality of the instrumentation helper
automatically generated by the TSMMD tool.

To remove an event source, you should call the DeleteEventSource method, specifying the
source name to remove. The following code shows an event source named MyApp1 being
removed from the local computer.
C#
System.Diagnostics.EventLog.DeleteEventSource("MyApp1");

The following code removes an event source from a remote computer.


C#
System.Diagnostics.EventLog.DeleteEventSource("MyApp1", "myserver");
Removing a source does not remove the entries that were written to that log using this source.
However, it does affect the entries by adding information to them indicating that the source
cannot be found.

Creating Event Handlers


You can create event handlers for your EventLog components to detect when an entry has
been written to a log. You can then raise notifications, or run code that automatically
corrects a problem.

The instrumentation helper automatically generated by the TSMMD tool does not create event
handlers.

To programmatically create a handler


1. Attach an event handler of type EntryWrittenEventHandler to your component's
EntryWritten event (in Visual Basic, use the AddHandler statement). The handler
will be called when an entry is written to the log. Your code
should look like the following.
this.eventLog1.EntryWritten += new
System.Diagnostics.EntryWrittenEventHandler(
this.eventLog1_EntryWritten);

For more information about this syntax, see "Event Handlers in Visual Basic and Visual
C#" on MSDN at http://msdn2.microsoft.com/en-us/library/aa984105(VS.71).aspx.

2. Create the EntryWritten procedure and define the code you want to process the
entries.
3. Set the EnableRaisingEvents property to true.
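Putting the three steps together, a minimal sketch might look like the following (the handler name, log name, and console output are illustrative, not part of the generated helper).

```csharp
using System;
using System.Diagnostics;

class EntryWrittenExample
{
    static void Main()
    {
        // Monitor the Application log on the local computer (".").
        EventLog eventLog1 = new EventLog("Application", ".");

        // Step 1: attach the handler.
        eventLog1.EntryWritten += new EntryWrittenEventHandler(OnEntryWritten);

        // Step 3: allow the component to raise events.
        eventLog1.EnableRaisingEvents = true;

        Console.WriteLine("Waiting for entries; press Enter to exit.");
        Console.ReadLine();
    }

    // Step 2: process each new entry.
    static void OnEntryWritten(object sender, EntryWrittenEventArgs e)
    {
        Console.WriteLine("New entry: " + e.Entry.Message);
    }
}
```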

Using Custom Event Logs


You should use a custom log if you want to organize events in a more granular way than is
allowed when your components write entries to the default Application log. For example,
suppose you have a component named OrderEntry that writes events to an event log. You are
interested in backing up and saving these entries for a longer period of time than some other
entries in the Application log. Instead of registering your component to write to the Application
log, you can create a custom log named OrdersLog and register your component to write entries
to that log instead. That way, all of your order information is stored in one place and will not be
affected if the entries in the Application log are cleared.
You may also use custom event logs in situations where you do not have rights to write to the
Application event log.
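Continuing the OrderEntry scenario above, the following sketch registers the source against a custom OrdersLog; CreateEventSource creates the log if it does not already exist. (The message text is illustrative, and the call requires sufficient rights on the local computer.)

```csharp
using System.Diagnostics;

class CustomLogSetup
{
    static void Main()
    {
        // Register the source and create the custom log if necessary.
        if (!EventLog.SourceExists("OrderEntry"))
        {
            EventLog.CreateEventSource("OrderEntry", "OrdersLog");
        }

        // Entries written with this source now go to OrdersLog instead of
        // the Application log.
        EventLog.WriteEntry("OrderEntry", "Order 42 accepted.",
            EventLogEntryType.Information);
    }
}
```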

Writing to a Custom Log


Typically, writing to a custom log consists of two high-level tasks:

• Installing the custom log (only necessary if the log does not already exist)
• Writing events to the custom log

The next sections describe these tasks in more detail.

Installing the Custom Log


You can use the EventLogInstaller class to create a custom log by setting its Log property to
the name of a log that does not already exist. The system then automatically creates the
custom log for you and registers an event source for that log.
The following example shows how to create a custom log named MyNewLog on the local
computer.
C#
using System;
using System.Management.Instrumentation;
using System.ComponentModel;
using System.Diagnostics;
using System.Configuration.Install;
using System.IO;
using System.Text;

namespace EventLogEvents.InstrumentationTechnology
{
[RunInstaller(true)]
public class EventLogEventsInstaller : Installer
{
// constructor
public EventLogEventsInstaller()
{

// Installer for events with source name: PS


EventLogInstaller myEventLogInstallerPS = new EventLogInstaller();
string resourceFilePS = Path.Combine(Environment.CurrentDirectory,
"EventMessagesPS.dll");
myEventLogInstallerPS.Source = "PS";
myEventLogInstallerPS.Log = "MyNewLog";
myEventLogInstallerPS.CategoryCount = 0;
myEventLogInstallerPS.CategoryResourceFile = resourceFilePS;
myEventLogInstallerPS.MessageResourceFile = resourceFilePS;

// Add the installer to the Installers collection so that
// InstallUtil.exe runs it during installation.
Installers.Add(myEventLogInstallerPS);
}
}
}

Writing Events to the Custom Log
Writing events to a custom log is the same process as writing events to any other log. For more
details, see "Writing Events to the Event Log" earlier in this chapter.

Other Custom Log Tasks


Other tasks you may perform with custom logs, such as reading from the log or clearing entries
from the log, are the same as performing the tasks on built-in logs. For more details, see the
corresponding sections earlier in this chapter.

Summary
This chapter has demonstrated many of the developer tasks associated with event log
instrumentation. Many of the developer tasks you will need to perform are automated by the
TSMMD tool. However, it is still important for developers to understand the work performed by
the TSMMD tool when developing instrumented applications.
Chapter 10
WMI Instrumentation
Windows Management Instrumentation (WMI) is the Microsoft implementation of Web-based
Enterprise Management (WBEM), which is an industry initiative developed to standardize the
technology for managing enterprise computing environments. WMI uses classes based on the
Common Information Model (CIM) industry standard to represent systems, processes, networks,
devices, and other enterprise components.
WMI supplies a pre-installed class schema that allows scripts or applications written in scripting
languages, Visual Basic, or C++ to monitor and configure applications, system or network
components, and hardware in an enterprise. For example, instances of the Win32_Process class
represent all the processes on a computer, and the Win32_LogicalDisk class can represent any
disk device. For more information, see "Win32 Classes" in the Windows Management
Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.
The WMI architecture consists of the following tiers:
• Client software components. These perform operations using WMI, such as reading
management details, configuring systems, and subscribing to events.
• Object manager. This is a broker between providers and clients that provides some key
services, such as standard event publication and subscription, event filtering, query
engine, and other services.
• Provider software components. These capture and return live data to the client
applications, process method invocations from the clients, and link the client to the
infrastructure being managed.

Not all the WMI instrumentation code described in this chapter is implemented in the
instrumentation helpers generated by the Team System Management Model Designer Power
Tool (TSMMD). However, it is still included in this chapter as it may be required.

WMI and the .NET Framework


WMI is the instrumentation standard used by management applications such as Microsoft
Operations Manager, Microsoft Application Center, and many third-party management tools.
The Windows operating system is instrumented with WMI, but developers who want to
generate instrumentation for their own applications must write their own instrumentation code.
WMI in the .NET Framework is built on the original WMI technology and allows the same
development of applications and providers with the advantages of programming in the .NET
Framework.
The classes in the System.Management.Instrumentation namespace allow managed code
developers to surface information to WMI-enabled tools. The goal in creating this namespace
was to minimize the work involved in enabling an application for management. The namespace
also makes it easy to expose events and data. Exposing an application's objects for management
should be intuitive for .NET Framework developers—the WMI schema is object-oriented and has
many traits in common with the .NET Framework metadata—code classes map to schema
classes, properties on code objects map to properties on WMI objects, and so on. Therefore, it is
easy to instrument managed code applications to provide manageability. Developers who are
already familiar with writing managed code have many of the skills required to provide
instrumentation through WMI. There is almost no learning curve.
You can expose application information for management by making declarations—no extensive
extra coding is required. The developer marks the objects as manageable by using the .NET
Framework classes and defines how they map to the corresponding WMI classes.
The developer can also derive the class from a common System.Management.Instrumentation
schema class, in which case the attribution and mapping is already done. The
InstrumentedAttribute and InstrumentationClassAttribute classes are the primary means of
instrumenting your code.
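As a sketch of this declarative approach, an application might expose a health object as follows. The class and property names here are illustrative assumptions, not part of any predefined schema.

```csharp
using System.Management.Instrumentation;

// Mark the class as WMI instance data; no MOF code is required.
[InstrumentationClass(InstrumentationType.Instance)]
public class AppHealth
{
    public string Status;
    public int QueueLength;
}

public class HealthPublisher
{
    public static void Publish()
    {
        AppHealth health = new AppHealth();
        health.Status = "Running";
        health.QueueLength = 0;

        // Make the instance visible to WMI clients.
        Instrumentation.Publish(health);
    }
}
```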
After your application is instrumented, other applications can discover, monitor, and configure
objects and events through WMI and the management applications developed by the extensive
WMI customer base (such as Computer Associates, Tivoli Systems, Inc., BMC Software, Hewlett-
Packard, and so on). The managed-code events marked for management are raised as WMI
events when the WMI event raising API is invoked.
Security support in System.Management is tightly linked to security in WMI. In WMI, client
access to information is controlled using namespace-based security. For more information, see
"Security for WMI in .NET Framework" on MSND at http://msdn2.microsoft.com/en-
us/library/ms186154.aspx.

Benefits of WMI Support in the .NET Framework


Writing a client application or provider using WMI support in the .NET Framework provides
several advantages over original WMI. In this case, writing a provider means adding
instrumentation to an application written in managed code.
WMI support in the .NET Framework offers the following advantages for writing client
applications and providers:

• Use of common language runtime features, such as garbage collection, custom


indexer, and dictionaries. It also offers other common language runtime features such
as automatic memory management, efficient deployment, an object-oriented
framework, evidence-based security, and exception handling.
• Definition of classes and publication of instances entirely with .NET Framework
classes to instrument applications so the applications can provide data to WMI. The
classes in System.Management.Instrumentation allow you to register a new provider,
create new classes, and publish instances without the developer having to use Managed
Object Format (MOF) code.
• Simplicity of use. Applications for WMI are sometimes difficult or lengthy to develop.
The class structure of System.Management brings more script-like simplicity to
applications developed in the .NET Framework. The development of both applications
and providers can be done more quickly with easier debugging.
• Access to all WMI data. Client applications have the same access to, and can do all the
same operations with WMI data as in the original WMI. Provider-instrumented
applications are somewhat more restricted. For more information, see "Limitations of
WMI in .NET Framework" on MSDN at http://msdn2.microsoft.com/en-
us/library/ms186136.aspx.

Limitations of WMI in the .NET Framework


An instrumented application can exist only as a decoupled provider, running out of process
from WMI. Objects exposed through native WMI providers can still expose features that
managed providers cannot, and those objects remain accessible from managed code through
the System.Management classes. A client application can
still do most of the original WMI client operations.
You will encounter most of the limitations of WMI in .NET Framework when writing provider-
instrumented applications. The limitations include the following:

• Managed code providers cannot define methods. Instrumented applications running


on the .NET Framework and providing data to WMI cannot use the
System.Management or System.Management.Instrumentation class methods to
define and implement WMI methods. A client application can still invoke the method of
an original WMI provider.
• Instrumented applications cannot expose writeable properties on new classes that
are not wrappers of underlying unmanaged WMI classes. A client of a WMI class
exposed by an instrumented managed application cannot change instance data and
then write the data back using a Put operation.
• You cannot create qualifiers on instrumented classes. Instead, managed code defines
several attributes in System.Management.Instrumentation that indicate
how the mapping between WMI classes and managed code classes is performed.
• You cannot define properties of instrumented objects as key properties.
• Although WMI supports both embedded objects and references to other objects,
with WMI in the .NET Framework you can use only embedded objects when defining
new classes.
• You cannot create an event consumer provider in managed code. For more
information, see "Writing an Event Consumer Provider" in the Windows Management
Instrumentation documentation in the MSDN Library at
http://msdn.microsoft.com/library. However, managed client applications can still
access existing unmanaged code and WMI consumer providers, such as the Standard
Consumers. For more information, see "Monitoring and Responding to Events with
Standard Consumers" in the Windows Management Instrumentation documentation in
the MSDN Library at http://msdn.microsoft.com/library.
• WMI in .NET Framework does not support refreshers. If you want to retrieve data from
Win32_FormattedData_* classes, you can use the
System.Diagnostics.PerformanceCounter class instead of using refreshers with the
Win32_FormattedData_* classes, or you can get the raw counter samples from the
Win32_PerfRawData_* classes at the desired interval and calculate the result yourself
using the last two samples. For more information about these Win32 classes, see
"Win32_Classes" in the Windows Management Instrumentation documentation in the
MSDN Library at http://msdn.microsoft.com/library.
• The System.Management.Instrumentation namespace does not support the
inheritance of classes if the derived class is in a different namespace than the parent
class.
• The WMI infrastructure and providers on native and managed (.NET) stacks have not
been verified for use in a cluster environment. This means that the WMI infrastructure
and providers are not supported by Microsoft in a cluster environment.

Using WMI.NET Namespaces


WMI organizes its preinstalled classes into namespaces. You should use the following
recommendations when defining WMI namespaces:
• As a convenience during development, and unless it is otherwise specified by the
InstrumentedAttribute class on the assembly, instrumentation data is published to the
root\default namespace. However, you should normally override this default and define
a specific namespace for your application so that it can be managed independently.
• Create a separate namespace for each assembly, group of assemblies, or
application that has similar security requirements. Use the company name and software
product name in your namespace definition to ensure uniqueness. For example,
instrumentation from your application can be published into the root\<your company
name>\<your product name> namespace. Potentially, the namespace hierarchy can
also contain version information (see more about versioning in the schema registration
section).
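For example, the namespace can be declared at the assembly level with the InstrumentedAttribute. The company and product names below follow the Northern Electronics worked example and are placeholders for your own.

```csharp
using System.Management.Instrumentation;

// Publish this assembly's instrumentation under a company\product
// namespace instead of the root\default namespace.
[assembly: Instrumented(@"root\NorthernElectronics\Shipping")]
```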

Administrators can use WMI Control to specify security constraints for a specific namespace. For
more information, see "Locating the WMI Control" in the Windows Management
Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.

The WMI namespaces, such as root\cimv2 and root\default, are not to be confused with the
.NET Framework namespaces System.Management and
System.Management.Instrumentation. The System.Management namespace contains the
WMI in .NET Framework classes to perform WMI operations. The
System.Management.Instrumentation namespace contains the classes for adding
instrumentation to your application.
Administrators and IT developers can use the classes in System.Management to write
applications that access WMI data in any .NET Framework language, such as C#, Visual Basic
.NET, or J#. These applications can do the following:
• Enumerate or retrieve a collection of instance property data, such as the FreeSpace
property of all the instances of Win32_LogicalDisk on all the computers of a network.
For more information, see "Win32_LogicalDisk" in the Windows Management
Instrumentation documentation in the MSDN Library at
http://msdn.microsoft.com/library.
• Query for selected instance data. WMI in .NET Framework uses the original WMI WQL
query language, a subset of SQL. For more information on WQL, see "WQL query
language" in the Windows Management Instrumentation documentation in the MSDN
Library at http://msdn.microsoft.com/library.
• Subscribe to events, defined as instances of event classes. An event occurs when an
instrumented application (provider) creates an instance of one of its event classes.

Publishing the Schema for an Instrumented Assembly to WMI


An instrumented application must undergo a registration stage, in which its schema can be
registered in the WMI repository. Schema publishing is required on a per assembly basis. Any
assembly that declares instrumentation types (events or instances) must have its schema
published to WMI. This is done using the standard installer mechanisms in the .NET Framework.

As a convenience for developers at design time, the schema is automatically published the first
time an application raises an event or publishes an instance. This avoids having to declare a
project installer and running the InstallUtil.exe tool during rapid prototyping of an application.
However, this registration will succeed only if the user invoking it is a member of the Local
Administrators group, so you should not rely on this as a mechanism for publishing the
schema.

The event (or instance) class schema resides in the assembly and is registered in the WMI
repository during installation.
To publish a schema to WMI, you must first define an installer for the project. You can use the
ManagementInstaller class provided in the System.Management.Instrumentation namespace.
For example, you would add the following code to your project installer's constructor.
C#
[RunInstaller(true)]
public class WmiEventsInstaller : DefaultManagementProjectInstaller
{
// constructor
public WmiEventsInstaller()
{
}
}

Typically, you should not call the methods of the ManagementInstaller class from within your
code; they are generally called only by the InstallUtil.exe installation utility. The utility
automatically calls the Install method during the installation process. It backs out failures, if
necessary, by calling the Rollback method for the object that generated the exception.

Republishing the Schema


In some cases, you will make changes to an application, and it will need to be reinstalled. In this
case, you should perform the following actions:
• In case of schema changes, re-install the assembly.
• Re-install the assemblies for all classes derived from the changed class (if there are any).
• Re-compile client applications.

If the currently registered schema becomes corrupted for any reason, there might be cases in
which re-running InstallUtil.exe will not detect the need to re-register the original schema. In
this case, it is possible to force the installer to re-install the schema using the /f or /force switch.
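For example, from a Visual Studio command prompt (the assembly name is a placeholder):

```shell
rem Force re-installation of the schema even if InstallUtil
rem believes the registered schema is current.
installutil /f ManagedProvider.exe
```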

It is not always necessary to recompile the client application when the schema changes. If the
event schema has been changed by adding properties and methods, and none of the earlier
defined properties or methods were removed, you can move the application's instrumentation
to a different WMI namespace, and not recompile the client application.

Unregistering the Schema


The ManagementInstaller class does not perform any operations at uninstall; specifically, it
does not unregister the schema. The reason is that more than one WMI provider could use the
same schema, and there is no mechanism in place for identifying whether a particular schema is
not being used by any other entity and can be safely removed. If you need to unregister the
schema, you can use the WBEMTest utility.

Instrumenting Applications Using WMI.NET classes


Developers can use the classes in System.Management.Instrumentation to instrument their
application so that it provides data to WMI about the behavior of the application.
Instrumenting an application involves defining classes and then setting attributes on those
classes to designate them for instrumentation. The running application creates instances of
those classes and publishes them to WMI using the services provided by WMI.NET classes. For
example, your application may expose data about its health. An instrumented application is a
provider of data to WMI in the same way that providers work in the original WMI.
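As a sketch of an event provider, an application can derive a class from BaseEvent and call its Fire method to raise the event to WMI. The event class, its fields, and the field values below are illustrative assumptions.

```csharp
using System.Management.Instrumentation;

// Mark the class as a WMI event.
[InstrumentationClass(InstrumentationType.Event)]
public class OrderProcessedEvent : BaseEvent
{
    public string OrderId;
    public int ElapsedMilliseconds;
}

public class OrderProcessor
{
    public static void Process(string orderId)
    {
        // ... process the order ...

        // Raise the event so that WMI subscribers receive it.
        OrderProcessedEvent evt = new OrderProcessedEvent();
        evt.OrderId = orderId;
        evt.ElapsedMilliseconds = 125;
        evt.Fire();
    }
}
```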
WMI .NET Classes
The following tables list the main classes that must be implemented for each of the specified
task areas. Where relevant, the associated interfaces and configuration elements are also listed.
This is not a comprehensive list of all the classes in each namespace, but it includes all classes
demonstrated in the How-to topics.
Classes in the System.Management Namespace
• Gathering WMI class information: ManagementObject, ManagementClass
• Querying for data: SelectQuery, ManagementObjectSearcher, WqlObjectQuery, ObjectQuery
• Querying for data asynchronously: ManagementObjectCollection,
ManagementOperationObserver
• Executing methods: ManagementBaseObject
• Executing methods asynchronously: ManagementOperationObserver
• Receiving events: WqlEventQuery, ManagementEventWatcher
• Receiving events asynchronously: EventArrivedEventArgs, EventArrivedEventHandler,
CompletedEventArgs, CompletedEventHandler
• Connecting to a remote computer: ConnectionOptions, ManagementScope

Classes in the System.Management.Instrumentation Namespace
• Creating data providers: Instance, InstrumentationClassAttribute, InstrumentedAttribute
• Creating event providers: BaseEvent, Instrumentation
• Registering a provider: ManagementInstaller

Accessing WMI Data Programmatically


You can create queries for WMI data in the .NET Framework, specified as a string in the WMI
supported WQL format, or constructed using a query class from the System.Management
namespace. The WqlEventQuery class is used for event queries, and the WqlObjectQuery class
is used for data queries.

The instrumentation helpers automatically generated by the TSMMD tool do not perform
queries for WMI data.

Creating targeted queries can noticeably increase the speed with which data is returned, and
make it easier to work with the returned data. Targeted queries can also cut down on the
amount of data that is returned, an important consideration for scripts that run over the
network.
The following code example shows how a query can be invoked using the
ManagementObjectSearcher class. In this case, the SelectQuery class is used to specify a
request for environment variables under the System user name. The query returns results in a
collection.
C#
using System;
using System.Management;

// This example demonstrates how to perform an object query.

public class QueryInstances {


public static int Main(string[] args) {
// Create a query for system environment variables only.
SelectQuery query = new SelectQuery("Win32_Environment",
"UserName=\"<SYSTEM>\"");

// Initialize an object searcher with this query.


ManagementObjectSearcher searcher = new ManagementObjectSearcher(query);

// Get the resulting collection and loop through it.


foreach (ManagementObject envVar in searcher.Get()) {
Console.WriteLine("System environment variable {0} = {1}",
envVar["Name"], envVar["VariableValue"]);
}
return 0;
}
}

The preceding example requires references to the System and System.Management


namespaces.
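Event queries work similarly. The following hedged sketch uses the WqlEventQuery and ManagementEventWatcher classes to block until the next process-creation event; the one-second polling interval is an assumption for the example.

```csharp
using System;
using System.Management;

public class WatchProcessCreation
{
    public static void Main()
    {
        // Query for creation of Win32_Process instances,
        // checked at one-second intervals.
        WqlEventQuery query = new WqlEventQuery("__InstanceCreationEvent",
            new TimeSpan(0, 0, 1),
            "TargetInstance ISA 'Win32_Process'");

        ManagementEventWatcher watcher = new ManagementEventWatcher(query);

        // Block until the next matching event arrives.
        ManagementBaseObject evt = watcher.WaitForNextEvent();
        ManagementBaseObject proc =
            (ManagementBaseObject)evt["TargetInstance"];
        Console.WriteLine("Process started: {0}", proc["Name"]);

        watcher.Stop();
    }
}
```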

Summary
This chapter has demonstrated many of the developer tasks associated with WMI
instrumentation. Most of the developer tasks you will need to perform are automated by the
TSMMD tool. However, it is still important for developers to understand the work performed by
the TSMMD tool when developing applications with WMI instrumentation.
Chapter 11
Windows Eventing 6.0
Instrumentation
In versions of the Windows operating system earlier than Windows Vista, you would use either
Event Tracing for Windows (ETW) or event logging to log events. Windows Vista introduces a
new Eventing model that unifies the ETW and Windows Event Log APIs.
The new model uses an XML manifest to define the events that you want to publish. Events can
be published to a channel or an ETW session. You can publish the events to the following types
of channels:
• Admin
• Operational
• Analytic
• Debug

This chapter provides an introduction to the Windows Eventing 6.0 mechanism, and describes
the tasks that must be performed when developing Eventing 6.0 instrumentation. For more
information about Windows Event Log, see "Windows Event Log" on MSDN
(http://msdn.microsoft.com/en-us/library/aa385780(VS.85).aspx).

Windows Eventing 6.0 Overview


Windows Vista and future versions of Windows Server incorporate an enhanced, XML-based
Event Log and the associated administrative tools. The new Event Log format allows events to
include additional fields for keywords, activity correlation between machines, and a link for
additional information. Events can be persisted into a hierarchical log structure, forwarded
between machines, and linked to scheduled tasks. Of course, backwards compatibility with
traditional Windows events is also supported.

Reusable Custom Views


A new Event Viewer provides graphical access to this richer event information (see Figure 1).
The new viewer provides capabilities for creating rule-based filters, searching across multiple
logs, and attaching tasks to specific events. Events can be displayed as XML, and administrators
can create custom filters using XPath queries. Custom views can also be exported for use on
other computers, or shared with other administrators.
Figure 1
Windows Event Viewer in Windows Vista and Windows Server 2008
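For example, a custom view that shows only error-level events from the Application log can be expressed with an XPath filter such as the following minimal sketch.

```xml
<QueryList>
  <Query Id="0" Path="Application">
    <!-- Level 2 corresponds to Error events. -->
    <Select Path="Application">*[System[(Level=2)]]</Select>
  </Query>
</QueryList>
```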

Command Line Operations


The new WevtUtil.exe utility is used to access event log information from the command line. It
supports the following command parameters:

• al (archive-log): Archive an exported log.
• cl (clear-log): Clear a log.
• el (enum-logs): List log names.
• ep (enum-publishers): List event publishers.
• epl (export-log): Export a log.
• gl (get-log): Get log configuration information.
• gli (get-log-info): Get log status information.
• gp (get-publisher): Get publisher configuration information.
• im (install-manifest): Install event publishers and logs from manifest.
• qe (query-events): Query events from a log or log file.
• sl (set-log): Modify configuration of a log.
• um (uninstall-manifest): Uninstall event publishers and logs from manifest.

Administrators can execute scripts against the Event Log if Windows PowerShell is installed.
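For example, typical WevtUtil invocations look like the following (the log names, event count, and output path are illustrative):

```shell
rem List the names of all logs on the computer.
wevtutil el

rem Display the five most recent Application log events as readable text.
wevtutil qe Application /c:5 /rd:true /f:text

rem Export the System log to a file.
wevtutil epl System C:\Logs\system.evtx
```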

Event Subscriptions
An instance of the Event Viewer enables administrators to view events on a single local or
remote computer. However, some troubleshooting scenarios may involve examining filtered
events stored in logs on multiple computers. The new Windows Eventing system includes the
ability to forward copies of events from multiple remote computers and collect them on a single
computer.
Administrators create event subscriptions to exactly specify which events will be collected, and
in which log they will be stored. Forwarded events can be viewed and manipulated as with any
other local events. Event subscriptions require configuring the Windows Remote Management
(WinRM) service and the Windows Event Collector (Wecsvc) service on participating forwarding
and collecting computers.
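A typical configuration sequence is sketched below; the subscription file name is a placeholder, and subscriptions can also be created interactively in Event Viewer.

```shell
rem On each forwarding (source) computer: enable Windows Remote Management.
winrm quickconfig

rem On the collecting computer: configure the Windows Event Collector service.
wecutil qc

rem Create the subscription from an XML description file.
wecutil cs subscription.xml
```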

Integration with Task Scheduler


Using the new Event Viewer, administrators have the ability to associate configurable tasks with
specific events (see Figure 2). The new Task Scheduler manages events created through the new
Event Viewer.
Figure 2
Specifying an Edit Trigger for a Windows Eventing 6.0 event

Online Event Information


Ideally, an event should contain enough context information to allow an administrator to
diagnose, correct, and verify exceptional conditions. Often this diagnostic information is not
known during development, and it is necessary to put a pointer to an external resource into the
event. Classic windows events often include this link in the message body, but there is no
standard mechanism for handling this.
The new Windows Event format includes a Uniform Resource Locator (URL) link to a web site
(Microsoft or developer defined) that allows administrators to jump quickly to a link with
additional, up-to-date information. URLs can be defined either in the Windows registry, or in an
instrumentation manifest. Registry-based URLs are common across all published events on the
computer where the registry is located. Registry-based URLs also override any URLs defined in
instrumentation manifests.

Publishing Windows Events


Publishing Windows Events consists of six high-level steps:
1. Decide the type of event to raise and where to publish the events (that is, which
channel).
2. Define the publisher, events, channels, and metadata in an instrumentation
manifest.
3. Execute the Message Compiler utility (mc.exe) against the manifest to generate a
header and binary resource files.
4. Write code to raise the events.
5. Compile and link your event publisher source code.
6. Install the publisher files.

The next sections describe each of these steps in more detail.
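As an illustration of steps 3 and 5 (the manifest and file names are placeholders), the Message Compiler generates a header and resource files from the manifest, which are then compiled and linked into the publisher binary:

```shell
rem Generate a header, a resource script, and binary message
rem resources from the instrumentation manifest.
mc -um MyEvents.man

rem Compile the resource script and link it into a resource-only DLL.
rc MyEvents.rc
link /dll /noentry MyEvents.res /out:MyEventsRes.dll
```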

Event Types and Event Channels


A channel is a named stream of events that transports events from an event publisher to an
event log file, where an event consumer can get an event. Event channels are intended for
specific audiences and have different types for each audience.
While most channels are tied to specific event publishers (they are created when publishers are
installed and deleted when publishers are uninstalled), there are a few channels that are
independent from any event publisher. System Event Log channels and event logs, such as
System, Application, and Security, are installed with the operating system and cannot be
deleted.
A channel can be defined on any independent Event Tracing for Windows (ETW) session. Such
channels are not controlled by Windows Event Log; they are controlled by the ETW consumer
that creates them.
Channels defined by event publishers are identified by a name and should be based on the
publisher name.

There are restrictions on channel naming. A channel name can contain spaces, but it cannot be
longer than 255 characters, and it cannot contain '>', '<', '&', '"', '|', '\', ':', '`', '?',
'*', or characters with codes less than 31. Additionally, the name must follow the general
constraints on file and registry key names.
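As an illustration, the naming rules above can be captured in a small validation routine. This is a hedged sketch in portable C++; the function name and the exact rule set are ours, not part of the Windows SDK, and it does not check the additional file and registry key name constraints.

```cpp
#include <string>

// Returns true if the name satisfies the channel-name rules described
// above: non-empty, at most 255 characters, no reserved punctuation,
// and no control characters (codes below 31). Spaces are allowed.
bool IsValidChannelName(const std::wstring& name)
{
    if (name.empty() || name.size() > 255)
        return false;

    const std::wstring forbidden = L"><&\"|\\:`?*";
    for (wchar_t ch : name)
    {
        if (ch < 31)                                  // control characters
            return false;
        if (forbidden.find(ch) != std::wstring::npos) // reserved punctuation
            return false;
    }
    return true;
}
```

For example, IsValidChannelName(L"Company-Product-Component/ChannelName") accepts the default channel name shown below, while a name containing '|' is rejected.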

The following XML example shows a valid default channel name.


XML
Company-Product-Component/ChannelName

Event Types and Channel Groups


Event types and channel types can be considered the same thing, because the type of channel
defines the type of event that travels through the channel to an event log. Channels fall into
two groups, serviced and direct, and each group contains two event (or channel) types.
Serviced Channel
You can subscribe to a serviced channel in addition to querying the channel. The event
consumer subscriptions to a serviced channel are based on XPath queries; thus, only events that
match the query are delivered to the subscribers. Events in the serviced channel can be
forwarded to another system. Forwarding is subscription-based and selected events can be
forwarded from any number of channels. Serviced channels have the following types:
• Administrative. These events are primarily targeted at administrators and support staff,
and indicate a serious fault or issue that requires intervention. Events of this type
should be relatively infrequent, and should indicate a problem with a well-defined
solution that an administrator can act on. An example of an administrative event is an
event that occurs when an application fails to connect to a printer. These events are
either well-documented or have a message associated with them that gives the reader
direct instructions about how to rectify the problem.
• Operational. Operational events are used for more mundane tasks such as reporting
status, or to assist in analyzing and diagnosing a problem or occurrence. They can be
used to trigger tools or tasks based on the problem or occurrence. An example of an
operational event is an event that occurs when a printer is added or removed from a
system.

Direct Channel
You cannot subscribe to a direct channel, but you can query a direct channel. A direct channel is
performance-oriented. Events are not processed in any way by the eventing system. This allows
the direct channel to support high volumes of events. Direct channels have the following types:

• Analytic. Analytic events are published in high volume. They describe program
operation and indicate problems that cannot be handled by user intervention.
• Debug. Debug events are used solely by developers to diagnose a problem for
debugging.

Channels Defined in the Winmeta.xml File


Some channels are already defined in the Winmeta.xml file that is included in the Windows SDK.
These channels can be imported using the importChannel element. The following list describes
these channels:

• TraceClassic. Type: Debug. Symbol: WINEVENT_CHANNEL_CLASSIC_TRACE (value: 0).
Events for classic ETW event tracing.
• System. Type: Admin. Symbol: WINEVENT_CHANNEL_GLOBAL_SYSTEM (value: 8). This
channel is used by applications running under system service accounts (installed
system services), drivers, or a component or application that has events that relate
to the health of the computer system.
• Application. Type: Admin. Symbol: WINEVENT_CHANNEL_GLOBAL_APPLICATION (value: 9).
Events for all user-level applications. This channel is not secured and is open to any
application. Applications that log extensive information should define an
application-specific channel.
• Security. Type: Admin. Symbol: WINEVENT_CHANNEL_GLOBAL_SECURITY (value: 10). The
Windows Audit Log. This event log is for the exclusive use of the Windows Local
Security Authority. User events may appear as audits if supported by the underlying
application.

Creating the Instrumentation Manifests


Instrumentation manifests contain the event publisher metadata, event definitions and
templates, channel definitions, and localized event messages.
Instrumentation manifests are created in a particular structure and can be validated against the
EventManifest schema.
The instrumentation manifest includes the following information:
• The identity of the publisher and the location of the publisher's resources.
• The definition and settings of any channels that the application creates. For more
information about channels, see "Event Logs and Channels in Windows Event Log" on
MSDN at http://msdn2.microsoft.com/en-us/library/aa385225.aspx.
• The definition, XML shape, message text, and destination channel of the events
reported by the publisher.
• Localized event messages.

Elements in the Instrumentation Manifest


Provider metadata and event information are found in the manifest in the following elements:

• <instrumentationManifest>. This is the top-level element in an instrumentation


manifest. This element contains the elements that configure event publishers, create
and configure new channels, disclose what events a publisher is planning to publish
(and into which channels the events are published), and provide localized strings to be
used in event rendering (displaying the event message).
• <instrumentation>. This contains the elements that configure event publishers and
disclose what events a publisher is planning to publish. This element contains a list of all
the publishers in the manifest.
• <events> (parent element: <instrumentation>). This contains the list of event publishers
defined in the manifest. This element also allows you to create a list of event
messages.
• <provider>. This contains provider metadata for an event publisher. The metadata
contains information such as the provider's name, channels that are used by the
provider, opcodes, and other data in the provider. For more information about the
metadata that can be defined, see "ProviderType Complex Type" on MSDN at
http://msdn2.microsoft.com/en-us/library/aa384018.aspx.
• <channels>. This contains the list of channels into which this provider publishes events.
Channels help define the event log into which the events will be "channeled." The
channels that are referenced in the event definitions must be declared in the manifest.
This allows you to obtain all the channels that are used in the manifest and by the event
publisher, and it allows system tools to verify that there are no spelling mistakes in the
channel names used in the event definitions. When a channel is referenced by an event
definition, the event will be published into this channel.
• <opcodes>. This contains the definitions of opcodes to be used by the events published
by this provider. For more information about opcodes, see "OpcodeType Complex
Type" on MSDN at http://msdn2.microsoft.com/en-us/library/aa383956.aspx.
• <keywords>. This contains the definitions of keywords to be used by the events
published by this provider. For more information about keywords, see "KeywordType
Complex Type" on MSDN at http://msdn2.microsoft.com/en-us/library/aa382786.aspx.
• <templates>. This contains the data-rendering templates used by the events published
by this provider. For more information about templates, see "Using Templates for
Events" later in this section.
• <events> (parent element: <provider>). This contains the definitions of the events
published by a provider. Each event has a 16-bit integer ID associated with it.
Additionally, each event has a set of classifiers present even when they are not explicitly
identified in the event definition (there are default values for all classifiers): task,
opcode, keywords, version, and level. The combination of the value and version of the
event uniquely identifies an event.
• <stringTable>. This specifies a list of event messages or references to strings in the
localization section of the manifest. The event message is a readable, localized
description of the event. The message can also contain substitution parameters
(similar to those in a template) that specify user-supplied values from the event to
substitute into the message, so that a full description suitable for display to the
user can be formed.

The following XML example shows how to use substitution parameters in event messages. A
printer name value can be substituted into the message (as it is the first parameter) during an
event.
XML
Print Spooler has failed to connect to %1 printer.
All further print jobs to this printer will fail.
Ping the printer to check if it is online.
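To make the substitution mechanism concrete, the following sketch performs the same kind of %N replacement on a message string. This is a simplified illustration in portable C++; the real formatting is performed by the Windows Event Log APIs, and this helper function is ours.

```cpp
#include <string>
#include <vector>

// Replaces %1, %2, ... in a message template with the supplied values,
// mimicking (in simplified form) how event messages are rendered.
std::string FormatEventMessage(std::string message,
                               const std::vector<std::string>& values)
{
    // Substitute higher-numbered parameters first so that "%10" is not
    // mistakenly matched by the "%1" replacement.
    for (size_t i = values.size(); i > 0; --i)
    {
        const std::string token = "%" + std::to_string(i);
        size_t pos = 0;
        while ((pos = message.find(token, pos)) != std::string::npos)
        {
            message.replace(pos, token.size(), values[i - 1]);
            pos += values[i - 1].size();
        }
    }
    return message;
}
```

For example, formatting "Print Spooler has failed to connect to %1 printer." with the value "OfficeLaser" produces "Print Spooler has failed to connect to OfficeLaser printer.", which is the behavior the preceding message illustrates.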

The following XML example shows an instrumentation manifest with each of the preceding
elements defined.
XML
<!-- <?xml version="1.0" encoding="UTF-16"?> -->
<instrumentationManifest
xmlns="http://schemas.microsoft.com/win/2004/08/events">
<instrumentation xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:win="http://manifests.microsoft.com/win/2004/08/windows/events">

<events xmlns="http://schemas.microsoft.com/win/2004/08/events">
<!--Publisher Info -->
<provider name="Microsoft-Windows-EventLogSamplePublisher"
guid="{1db28f2e-8f80-4027-8c5a-a11f7f10f62d}"
symbol="MICROSOFT_SAMPLE_PUBLISHER"
resourceFileName="C:\temp\Publisher.exe"
messageFileName="C:\temp\Publisher.exe">

<!--Channels to which this Publisher can publish -->


<channels>
<!--Pre-Existing channel can be imported, but not required. -->
<importChannel chid="C1" name="Application"/>
<!--New Channel can be declared for this Publisher-->
<channel chid="MyChannel"
name="Microsoft-Windows-SamplePublisher/Operational"
type="Operational"
symbol="SAMPLE_PUBLISHER"
isolation="Application" enabled="true" />
</channels>

<!--Event Templates -->


<templates>
<template tid="MyEventTemplate">
<data name="Prop_UnicodeString" inType="win:UnicodeString" />
<data name="Prop_AnsiString" inType="win:AnsiString"
outType="xs:string" />
<data name="Prop_Int8" inType="win:Int8" />
<data name="Prop_UInt8" inType="win:UInt8" />
<data name="Prop_Int16" inType="win:Int16" />
<data name="Prop_UInt16" inType="win:UInt16" />
<data name="Prop_Int32" inType="win:Int32" />
<data name="Prop_UInt32" inType="win:UInt32" />
<data name="Prop_Int64" inType="win:Int64" />
<data name="Prop_UInt64" inType="win:UInt64" />
<data name="Prop_Float" inType="win:Float" />
<data name="Prop_Double" inType="win:Double" />
<data name="Prop_Boolean" inType="win:Boolean" />
<data name="Prop_GUID" inType="win:GUID" />
<data name="Prop_Pointer" inType="win:Pointer" />
<data name="Prop_FILETIME" inType="win:FILETIME" />
<data name="Prop_SYSTEMTIME" inType="win:SYSTEMTIME" />
<data name="Prop_SID_Length" inType="win:UInt32" />
<data name="Prop_SID" inType="win:SID" length="Prop_SID_Length"/>
<data name="Prop_Binary" inType="win:Binary" length="11" />
<UserData>
<MyEvent2 xmlns="myNs">
<Prop_UnicodeString> %1 </Prop_UnicodeString>
<Prop_AnsiString> %2 </Prop_AnsiString>
<Prop_Int8> %3 </Prop_Int8>
<Prop_UInt8> %4 </Prop_UInt8>
<Prop_Int16> %5 </Prop_Int16>
<Prop_UInt16> %6 </Prop_UInt16>
<Prop_Int32> %7 </Prop_Int32>
<Prop_UInt32> %8 </Prop_UInt32>
<Prop_Int64> %9 </Prop_Int64>
<Prop_UInt64> %10 </Prop_UInt64>
<Prop_Float> %11 </Prop_Float>
<Prop_Double> %12 </Prop_Double>
<Prop_Boolean> %13 </Prop_Boolean>
<Prop_GUID> %14 </Prop_GUID>
<Prop_Pointer> %15 </Prop_Pointer>
<Prop_FILETIME> %16 </Prop_FILETIME>
<Prop_SYSTEMTIME> %17 </Prop_SYSTEMTIME>
<Prop_SID_Length> %18 </Prop_SID_Length>
<Prop_SID> %19 </Prop_SID>
<Prop_Binary> %20 </Prop_Binary>
</MyEvent2>
</UserData>
</template>

</templates>

<!--All the Events that can be published by this Publisher -->


<events>
<event value="1"
level="win:Informational"
template="MyEventTemplate"
opcode="win:Info"
channel="MyChannel"
symbol="PROCESS_INFO_EVENT"
message="$(string.Publisher.EventMessage)"/>
</events>

</provider>

</events>

</instrumentation>

<localization>
<resources culture="en-US">
<stringTable>
<!--This is how event data can be used as part of Message String -->
<string id="Publisher.EventMessage"
value="Prop_UnicodeString=%1;%n
Prop_AnsiString=%2;%n
Prop_Int8=%3;%n
Prop_UInt8=%4;%n
Prop_Int16=%5;%n
Prop_UInt16=%6;%n
Prop_Int32=%7;%n
Prop_UInt32=%8;%n
Prop_Int64=%9;%n
Prop_UInt64=%10;%n
Prop_Float=%11;%n
Prop_Double=%12;%n
Prop_Boolean=%13;%n
Prop_GUID=%14;%n
Prop_Pointer=%15;%n
Prop_FILETIME=%16;%n
Prop_SYSTEMTIME=%17;%n
Prop_SID_Length=%18;%n
Prop_SID=%19;%n
Prop_Binary=%20"/>
</stringTable>
</resources>
</localization>
</instrumentationManifest>

You can also create event descriptions in multiple languages, by adding the localized strings to
the localization element of the instrumentation manifest.

Using Templates for Events


Templates specify the names and the types of data that the event publisher supplies with an
event. Additionally, a template may specify the XML structure of the event (defined static
content of the event, and insertions for dynamic content of the event).
If an XML template is attached to the event, the event can be represented as an XML fragment.
In the XML representation, each event attribute value is labeled with its semantic meaning,
which allows queries and analysis to be performed later.
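As a rough illustration of why this labeling matters, the sketch below renders a set of named values as an XML fragment in the spirit of a template. This is portable C++; the element names and structure are illustrative and are not the exact output of the event system.

```cpp
#include <string>
#include <utility>
#include <vector>

// Renders name/value pairs as an XML fragment so that each value is
// labeled with its semantic meaning. This labeling is what makes the
// event queryable later (for example, by an XPath filter on a field).
std::string RenderEventXml(
    const std::string& eventName,
    const std::vector<std::pair<std::string, std::string>>& data)
{
    std::string xml = "<" + eventName + ">";
    for (const auto& item : data)
    {
        xml += "<" + item.first + ">" + item.second + "</" + item.first + ">";
    }
    xml += "</" + eventName + ">";
    return xml;
}
```

For example, rendering a hypothetical printer event with a PrinterName field yields a fragment such as <PrinterEvent><PrinterName>OfficeLaser</PrinterName></PrinterEvent>, which an XPath query can then select by field name rather than by position.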

Using the Message Compiler to Produce Development Files


MC.exe is used to produce development files that are required for compiling the source files
that raise events. It creates the following files:
• .h. This is the header file that contains the definitions for the event provider, event
attributes, channels, and events. These values are referenced when creating a handle
for the provider and when publishing events.
• .rc. This is a resource compiler script that can be used to include the generated
resources. This script is included in the component's main resource file.
• .bin. This suffix is used for two different types of bin files (typically, multiple bin files are
created). The first type of bin file is a culture-independent resource that contains the
provider and event metadata. This is the template resource, which is signified by the
TEMP suffix of the base name of the file. The second type of bin file is a culture-
dependent (localizable) resource that contains a message table. This is the message
resource and its name by default starts with the MSG prefix that is followed by a
number. The number starts with one and is incremented for each additional language
defined in the manifest.

An event publisher application uses these files along with the Windows Event Log API to publish
events to an event channel.
If MC.exe is used on the instrumentation manifest shown in the previous section, the following
Publisher.h file is generated.
C++
// publisher.h

#pragma once
__declspec(selectany) GUID MICROSOFT_SAMPLE_PUBLISHER = {0x1db28f2e, 0x8f80,
0x4027, {0x8c, 0x5a,0xa1,0x1f,0x7f,0x10,0xf6,0x2d}};
#define SAMPLE_PUBLISHER 0x10
__declspec(selectany) EVENT_DESCRIPTOR PROCESS_INFO_EVENT = {0x1, 0x0, 0x10,
0x4, 0x0, 0x0, 0x8000000000000000};
#define MSG_Publisher_EventMessage 0x00000000L

// end of publisher.h

The Publisher.h header file contains an EVENT_DESCRIPTOR variable definition that was defined
in the instrumentation manifest. This variable will be used in the EventWrite function call to
publish the event.

Writing Code to Raise Events


The following code example shows code to raise events from an event publisher.
C++
// publisher.cpp

#include <windows.h>
#include <comdef.h>
#include <sddl.h>
#include <iostream>
#include <tchar.h>
#include <string>
#include <vector>
#include <evntprov.h> // ETW Publishing header
# pragma comment(lib, "advapi32.lib")
#include <winevt.h> // EventLog Header
# pragma comment(lib, "wevtapi.lib")
#include "publisher.h" // Header generated by mc.exe
// from manifest (publisher.man)
using namespace std;

void __cdecl wmain()


{
REGHANDLE hPublisher = NULL; //Handle to Publisher

wprintf(L"Publishing Event to Microsoft-Windows-EventLogSamplePublisher/Operational Channel... \n");

// Register a Publisher
ULONG ulResult = EventRegister(
&MICROSOFT_SAMPLE_PUBLISHER, // provider guid
NULL, // callback; unused for now
NULL, // context
&hPublisher); // handle required to unregister

if ( ulResult != ERROR_SUCCESS)
{
wprintf(L"Publisher Registration Failed!. Error = 0x%x", ulResult);
return;
}

// EventData
std::vector<EVENT_DATA_DESCRIPTOR> EventDataDesc;
EVENT_DATA_DESCRIPTOR EvtData;

// inType="win:UnicodeString"
PWSTR pws = L"Sample Unicode string";
EventDataDescCreate(&EvtData, pws, ((ULONG)wcslen(pws)+1)*sizeof(WCHAR));
EventDataDesc.push_back( EvtData );

// inType="win:AnsiString"
CHAR * ps = "Sample ANSI string";
EventDataDescCreate(&EvtData, ps, ((ULONG)strlen(ps)+1)*sizeof(CHAR));
EventDataDesc.push_back( EvtData );

// inType="win:Int8"
INT8 i8 = 0x7F;
EventDataDescCreate(&EvtData, &i8, sizeof(i8));
EventDataDesc.push_back( EvtData );

// inType="win:UInt8"
UINT8 ui8 = 0xFF;
EventDataDescCreate(&EvtData, &ui8, sizeof(ui8));
EventDataDesc.push_back( EvtData );

// inType="win:Int16"
INT16 i16 = 0x7FFF;
EventDataDescCreate(&EvtData, &i16, sizeof(i16));
EventDataDesc.push_back( EvtData );

// inType="win:UInt16"
UINT16 ui16 = 0xFFFF;
EventDataDescCreate(&EvtData, &ui16, sizeof(ui16));
EventDataDesc.push_back( EvtData );

// inType="win:Int32"
INT32 i32 = 0x7FFFFFFF;
EventDataDescCreate(&EvtData, &i32, sizeof(i32));
EventDataDesc.push_back( EvtData );

// inType="win:UInt32"
UINT32 ui32 = 0xFFFFFFFF;
EventDataDescCreate(&EvtData, &ui32, sizeof(ui32));
EventDataDesc.push_back( EvtData );

// inType="win:Int64"
INT64 i64 = 0x7FFFFFFFFFFFFFFFi64;
EventDataDescCreate(&EvtData, &i64, sizeof(i64));
EventDataDesc.push_back( EvtData );

// inType="win:UInt64"
UINT64 ui64 = 0xFFFFFFFFFFFFFFFFui64;
EventDataDescCreate(&EvtData, &ui64, sizeof(ui64));
EventDataDesc.push_back( EvtData );

// inType="win:Float"
FLOAT f = -3.1415926e+23f;
EventDataDescCreate(&EvtData, &f, sizeof(f));
EventDataDesc.push_back( EvtData );

// inType="win:Double"
DOUBLE d = -2.7182818284590452353602874713527e-101;
EventDataDescCreate(&EvtData, &d, sizeof(d));
EventDataDesc.push_back( EvtData );

// inType="win:Boolean"
BOOL b = TRUE;
EventDataDescCreate(&EvtData, &b, sizeof(b));
EventDataDesc.push_back( EvtData );

// inType="win:GUID"
GUID guid = {0}; // Initialize so that uninitialized memory is not published.
EventDataDescCreate(&EvtData, &guid, sizeof(guid));
EventDataDesc.push_back( EvtData );

// inType="win:Pointer"
PVOID p = NULL;
EventDataDescCreate(&EvtData, &p, sizeof(p));
EventDataDesc.push_back( EvtData );

// inType="win:FILETIME"
SYSTEMTIME st;
FILETIME ft;

GetSystemTime(&st);
SystemTimeToFileTime(&st, &ft);
EventDataDescCreate(&EvtData, &ft, sizeof(ft));
EventDataDesc.push_back( EvtData );

// inType="win:SYSTEMTIME"
GetSystemTime(&st);
EventDataDescCreate(&EvtData, &st, sizeof(st));
EventDataDesc.push_back( EvtData );

// inType="win:SID"
PSID pSid = NULL;
ConvertStringSidToSidW(L"S-1-5-19", &pSid); // LocalService

UINT32 sidLength = GetLengthSid(pSid);
EventDataDescCreate(&EvtData, &sidLength, sizeof(sidLength));
EventDataDesc.push_back( EvtData );

EventDataDescCreate(&EvtData, pSid, sidLength);
EventDataDesc.push_back( EvtData );

// inType="win:Binary"
// Note: if you change the size of this array you'll have to change the
// "length" attribute in the manifest too.
BYTE ab[] = {0,1,2,3,4,5,4,3,2,1,0};
EventDataDescCreate(&EvtData, ab, sizeof(ab));
EventDataDesc.push_back( EvtData );

if ( EventEnabled(hPublisher, &PROCESS_INFO_EVENT) )
{
ulResult = EventWrite(hPublisher,
&PROCESS_INFO_EVENT,
(ULONG)EventDataDesc.size(),
&EventDataDesc[0]
);
if (ulResult != ERROR_SUCCESS)
{
//Get Extended Error Information
wprintf(L"EvtWrite Failed. Not able to fire event. Error = 0x%x",
ulResult);
LocalFree(pSid);

// Close the Publisher Handle


EventUnregister(hPublisher);

return;
}
}
else {
wprintf(L"Disabled");
}
wprintf(L"Success\n");

LocalFree(pSid);

// Close the Publisher Handle


EventUnregister(hPublisher);
}

// end of publisher.cpp

Compiling and Linking Event Publisher Source Code


The resource script that is generated by the Message Compiler tool is included in the resource
script of the program being built, and the result is compiled by the Resource Compiler (RC.exe)
tool to produce a .res file. The .res file is then linked into the project binary during the
link phase (using CL.exe or Link.exe).
The commands for this step are as follows:
• rc.exe publisher.rc
• cl.exe publisher.cpp /link publisher.res

The Publisher.cpp file is shown in the preceding example (it includes the generated Publisher.h).
The Publisher.res file is the resource file generated from the Publisher.rc file.

Installing the Publisher Files


Publisher files, including the manifest, must be installed on the target system. You install the
manifest using the Wevtutil.exe utility:

wevtutil install-manifest publisher.man

This command is usually limited to members of the Administrators group and must be run with
elevated privileges. As a result, this step typically occurs when the application is installed.
To remove the manifest when the application is uninstalled, use the corresponding wevtutil
uninstall-manifest command.
Consuming Event Log Events

Eventing 6.0 includes a number of mechanisms for consuming event log events, such as
querying, reading, and subscribing. This section outlines those mechanisms.

Querying for Events


A user can query either over active event logs (logs that are still maintained within the
system) or over an external event log file that was previously exported from the system. Events
can still be written to a log while the user is querying it. A user can also query over event
logs on the local computer or on a remote computer; to query a remote computer, establish a
session with the EvtOpenSession function and pass the resulting session handle as the first
argument to the EvtQuery function.
An event query can be created by using an XPath query or an XML-formatted query.

Querying Over Active Event Logs


A user queries over active event logs by specifying an event query and then obtaining a query
result set, which is used to enumerate the results. Note that the registering of an active log
query does not cause the system to return a snapshot of events at the time of the query.
Instead, the system generates the result set as the user traverses through it so that events that
are generated during the query are not lost.
The following C++ example shows how to query over active event logs using the EvtQuery
function to obtain a handle to the query result set for use later to enumerate through the result
set.
C++
EVT_HANDLE queryResult = EvtQuery (
NULL,
L"Application",
L"*",
EvtQueryChannelPath | EvtQueryReverseDirection );

if ( queryResult == NULL )
return GetLastError();

Querying Over External Files


To query over an exported event log file, .evt file, or .etl file, use the same EvtQuery
function that is used when querying over active logs, but pass in a path to the external file
and the appropriate flags. The query is executed on the external file.
Relative paths and environment variables cannot be used when specifying an exported event log
file. A Universal Naming Convention (UNC) path can be used to locate the file. Any relative path
and environment variable expansion must be done before making API calls, as shown in the
following C++ code example.
C++
EVT_HANDLE queryResult = EvtQuery (
NULL,
L"c:\\temp\\MyExportedLog.log",
L"*",
EvtQueryFilePath | EvtQueryForwardDirection );

if ( queryResult == NULL )
return GetLastError();

Reading Events from a Query Result Set


The process for obtaining events from a query result set is the same if the original source of the
events was an active log or an exported log. You obtain an enumeration object over the result
set and use its methods to retrieve the event instances. The system supports simple forward-
only navigation over the result set in direct logs and .evt files. Other logs can be read forward
and backward. Event instances can be fetched from the log files in batches to improve the
performance.
The following C++ example shows how to use the EvtNext function to obtain event instances
from the query result set. For efficiency reasons, it is recommended that the user specify a
batch size much greater than 1 for enumerating large result sets.
C++
const int BatchSize = 10;
DWORD numRead = 0;
EVT_HANDLE batch[BatchSize];

if (!EvtNext(queryResult, BatchSize, batch, INFINITE, 0, &numRead))
    return GetLastError();

for (DWORD i = 0; i < numRead; i++)
{
    // Render the event instance here.

    EvtClose(batch[i]);
}

Subscribing to Events
Subscribing to events involves receiving notifications when selected events are raised. To
select events for a subscription, an event query is applied to events that are logged in one or
more channels. For information about creating a query, see "Event Selection" on MSDN at
http://msdn2.microsoft.com/en-us/library/aa385231.aspx. Because the event stream is persisted
to a log, subscriptions can receive events that occurred during periods when the subscriber was
not connected; a subscriber does not miss events that occur during down times (such as computer
startup or shutdown).
Not only can log subscribers get the live events that pass the subscription filter, they can also get
the events that occurred before they were connected. At the time the subscription starts, any
events in the log that match the subscription start criteria are queued first and then live events
are added to the queue as they occur.
The subscription start criteria include the following:

• Subscribing to future events (events not currently in the event log)


• Subscribing to events since the oldest event in the log
• Subscribing to events since a bookmark (marking some other event)

If a subscriber wants to ensure that it never misses a record and does not receive repeat
events, the subscriber indicates the last record that it received, which is marked by a
bookmark. The starting criteria for a subscription are specified in the Flags parameter of the
EvtSubscribe function by passing in a value from the EVT_SUBSCRIBE_FLAGS enumeration.

Push Subscriptions
In the push subscription model, events are delivered asynchronously to the callback function
that is provided to the EvtSubscribe function.
The following C++ example shows how to set up a push subscription by passing a callback
function into the Callback parameter of the EvtSubscribe function. The example subscribes to all
the Level 2 events in the Application channel.
C++
#include <windows.h>
#include <iostream>
#include <conio.h> // for _getwch()

#include <winevt.h> // EventLog Header
#pragma comment(lib, "wevtapi.lib")

using namespace std;

// Callback to receive RealTime Events.


DWORD WINAPI SubscriptionCallBack(
EVT_SUBSCRIBE_NOTIFY_ACTION Action,
PVOID Context,
EVT_HANDLE Event );

void __cdecl wmain()


{
EVT_HANDLE hSub = NULL; // Handle to the event subscriber.
wchar_t *szChannel = L"Application"; // Channel.
wchar_t *szQuery = L"*[System/Level=2]"; // XPATH Query to specify which
// events to subscribe to.

wprintf(L"Subscribing to all level 2 events from the Application channel... \n");
wprintf(L"NOTE: Hit 'Q' or 'q' to stop the event subscription\n");

// Register the subscription.


hSub = EvtSubscribe( NULL, // Session
NULL, // Used for pull subscriptions.
szChannel, // Channel.
szQuery, // XPath query.
NULL, // Bookmark.
NULL, // CallbackContext.
(EVT_SUBSCRIBE_CALLBACK) SubscriptionCallBack, // Callback.
EvtSubscribeToFutureEvents // Flags.
);

if( !hSub )
{
wprintf(L"Couldn't Subscribe to Events!. Error = 0x%x", GetLastError());
return;
}
else
{
// Keep listening for events until 'q' or 'Q' is hit.
WCHAR ch = L'0';
do
{
ch = _getwch();
ch = towupper( ch );
Sleep(100);
} while( ch != 'Q' );
}

// Close the subscriber handle.


EvtClose(hSub);

wprintf(L"Event Subscription Closed !\n");


}

/**********************************************************************
Function: CallBack
Description: This function is called by EventLog to deliver RealTime Events.
Once the event is received it is rendered to the console.
Return: DWORD is returned. 0 if succeeded, otherwise a Win32 errorcode.

***********************************************************************/
DWORD WINAPI SubscriptionCallBack(
EVT_SUBSCRIBE_NOTIFY_ACTION Action,
PVOID Context,
EVT_HANDLE Event )
{
WCHAR *pBuff = NULL;
DWORD dwBuffSize = 0;
DWORD dwBuffUsed = 0;
DWORD dwRes = 0;
DWORD dwPropertyCount = 0;

// Get the XML EventSize to allocate the buffer size.


BOOL bRet = EvtRender(
NULL, // Session.
Event, // HANDLE.
EvtRenderEventXml, // Flags.
dwBuffSize, // BufferSize.
pBuff, // Buffer.
&dwBuffUsed, // Buffersize that is used or required.
&dwPropertyCount);

if (!bRet)
{
dwRes = GetLastError();
if( dwRes == ERROR_INSUFFICIENT_BUFFER )
{
// Allocate the buffer size needed for the XML event.
dwBuffSize = dwBuffUsed;
pBuff = new WCHAR[dwBuffSize/sizeof(WCHAR)];

// Get the Event XML


bRet = EvtRender(
NULL, // Session.
Event, // HANDLE.
EvtRenderEventXml, // Flags.
dwBuffSize, // BufferSize.
pBuff, // Buffer.
&dwBuffUsed, // Buffer size that is used or required.
&dwPropertyCount);

if( !bRet )
{
dwRes = GetLastError();
wprintf(L"Couldn't render events. Error = 0x%x", dwRes);
delete[] pBuff;
return dwRes;
}
dwRes = ERROR_SUCCESS; // Rendering succeeded on the second call.
}
}

// Display the Event XML on console


wprintf(L"The following Event is received : \n %s \n\n", pBuff);

// Cleanup
delete[] pBuff;

return dwRes;
}

Pull Subscriptions
The pull subscription model is used to control the delivery of events by allowing the caller to
decide when to get an event from the queue.
To create a pull-model subscription, the caller must provide an event handle in the SignalEvent
parameter of the EvtSubscribe function. The event provided in the SignalEvent parameter is
signaled when the first event arrives in the queue. It is also signaled when an event arrives
after the client has attempted to read from an empty queue.
A client can wait on an event until it is set. After the event is set, the client can read the
subscription results using the EvtNext function until the EvtNext function fails because of an
empty queue (in which case the client can start waiting again).
In the pull subscription model, the user obtains an enumeration object over the result set and
uses its methods to retrieve the event instances.
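The wait-and-drain pattern just described (block until the signal is set, read until the queue is empty, then wait again) can be sketched in portable C++ using a condition variable in place of the Win32 event handle. The queue class and names here are illustrative and are not part of the Windows Event Log API.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

// A minimal stand-in for the subscription queue: the publishing side
// pushes events and signals; the consumer drains until empty, then waits.
class PullQueue
{
public:
    void Publish(const std::string& evt)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push_back(evt);
        }
        m_signal.notify_one(); // Corresponds to setting the Win32 event.
    }

    // Blocks until at least one event is available, then drains the
    // queue, mirroring the read-until-empty loop of the pull model.
    std::vector<std::string> WaitAndDrain()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_signal.wait(lock, [this] { return !m_queue.empty(); });
        std::vector<std::string> batch(m_queue.begin(), m_queue.end());
        m_queue.clear();
        return batch;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_signal;
    std::deque<std::string> m_queue;
};
```

A consumer thread would call WaitAndDrain in a loop while publisher threads call Publish, just as the real pull subscriber waits on the signal event and then calls EvtNext until the queue is empty.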
The following C++ example shows how to subscribe to events from event log channels using a
pull subscription. It registers a subscriber by providing an XPATH query, and then if any events
are received, the event XML is displayed on console using EvtRender.
C++
#include <windows.h>
#include <wchar.h>
#include <winevt.h> // EventLog Header
# pragma comment(lib, "wevtapi.lib")

void __cdecl wmain()


{
// Channel.
PWSTR szChannel = L"Application";
// XPATH Query to specify which events to subscribe to.
PWSTR szQuery = L"*";

wprintf(L"Subscribing to all events from the Application channel... \n");

const int BatchSize = 10;


DWORD numRead = 0;
EVT_HANDLE batch[BatchSize];

HANDLE signalEvent = CreateEventW(NULL, false, false, NULL);


// Register the subscription.
EVT_HANDLE subscription = EvtSubscribe(
NULL, // Session
signalEvent, // Used for pull subscriptions.
szChannel, // Channel.
szQuery, // XPath query.
NULL, // Bookmark.
NULL, // CallbackContext.
NULL, // Callback.
EvtSubscribeToFutureEvents // Flags.
);

if( subscription == NULL )


{
wprintf(L"Couldn't subscribe to events. Error = 0x%x", GetLastError());
return;
}
else
{
DWORD result = ERROR_SUCCESS;
while( result == ERROR_SUCCESS )
{
if( EvtNext( subscription, BatchSize, batch, -1, 0, &numRead) )
{
// Do something with numRead event handles in the batch array.
// For example, render the events.

for( DWORD i=0; i < numRead; i++)


{
// Render the events in the array
WCHAR *pBuff = NULL;
DWORD dwBuffSize = 0;
DWORD dwBuffUsed = 0;
DWORD dwRes = 0;
DWORD dwPropertyCount = 0;

// Get the XML EventSize to allocate the buffer size.


BOOL bRet = EvtRender(
NULL, // Session.
batch[i], // EVT_HANDLE.
EvtRenderEventXml, // Flags.
dwBuffSize, // BufferSize.
pBuff, // Buffer.
&dwBuffUsed, // Buffer size used.
&dwPropertyCount);

if (!bRet)
{
dwRes = GetLastError();
if( dwRes == ERROR_INSUFFICIENT_BUFFER )
{
// Allocate the buffer size needed for the XML event.
dwBuffSize = dwBuffUsed;
pBuff = new WCHAR[dwBuffSize/sizeof(WCHAR)];

// Get the event XML.
bRet = EvtRender(
NULL, // Context.
batch[i], // EVT_HANDLE.
EvtRenderEventXml, // Flags.
dwBuffSize, // BufferSize.
pBuff, // Buffer.
&dwBuffUsed, // Buffer size used.
&dwPropertyCount);

if( !bRet )
{
wprintf(L"Couldn't render events. Error = 0x%x",
GetLastError());
delete[] pBuff;
// Close the remaining event handles for this batch.
for(DWORD j=i; j < numRead; j++)
{
EvtClose(batch[j]);
}
break;
}
}
else
{
// Any other render error is fatal for this batch.
wprintf(L"Couldn't render events. Error = 0x%x", dwRes);
// Close the remaining event handles for this batch.
for(DWORD j=i; j < numRead; j++)
{
EvtClose(batch[j]);
}
break;
}
}

// Display the event XML on the console.
wprintf(L"The following event is received : \n %s \n\n", pBuff);

// Cleanup
delete[] pBuff;
EvtClose(batch[i]);
}
}
else
{
DWORD waitResult = 0;
result = GetLastError();
if( result == ERROR_NO_MORE_ITEMS )
{
// Wait for the subscription results
waitResult = WaitForSingleObject( signalEvent, INFINITE );
if( waitResult == WAIT_OBJECT_0 )
{
result = ERROR_SUCCESS;
}
else
{
result = GetLastError();
break;
}
}
}
}
}

// Close the subscription handle.
EvtClose(subscription);

CloseHandle(signalEvent);

wprintf(L"Event subscription closed.\n");
}

Summary
This chapter provides information about the Windows Eventing 6.0 mechanism. It describes how
to view and handle events in Windows Vista and Windows Server 2008. It also contains technical
information about the way that Windows Eventing 6.0 works and how you can interact with the
mechanism in your own programs. You can define and implement Windows Eventing 6.0 events
in an application using the Team System Management Model Designer.
Chapter 12
Performance Counters Instrumentation
Windows collects performance data about various system resources using performance
counters. Windows contains a pre-defined set of performance counters with which you can
interact; you can also create additional performance counters relevant to your application. This
chapter describes how to install performance counters, how to write to them, and how to read
existing performance counters.

Example code automatically generated by the Team System Management Model Designer
Power Tool (TSMMD) for the Northern Electronics scenario is used in this chapter to illustrate
its concepts.

Performance Counter Concepts


To work effectively with performance counters, it is important to understand some key
concepts, including the following:
• Categories
• Instances
• Types

The next sections describe each of these in more detail.

Categories
Performance counters monitor aspects of the behavior of performance objects on a computer.
Performance objects include physical components, such as processors, disks, and memory;
system objects, such as processes and threads; and application objects, such as databases and
Web services.
Counters that are related to the same performance object are grouped into categories that
indicate their common focus. When you create an instance of the PerformanceCounter object,
you first indicate the category (for example, the Memory category) for the object and then
choose a counter to interact with from within that category (for example Cached Bytes).
If you create new performance counter objects for your application, you cannot associate them
with existing categories. Instead, you must create a new category for the performance counter
object.

Instances
In some cases, categories are further subdivided into instances. If multiple instances are defined
for a category, each performance counter in the category also has those instances defined. For
example, the Process category contains instances named "Idle" and "System." Each counter
within the Process category can report data for either of these instances, showing information
about either the Idle process or the System process. Figure 1 illustrates the structure of the
category and counters.

Figure 1
Performance counter categories and instances

Although instances are applied to the category, you create an instance by specifying an
instanceName on the PerformanceCounter constructor. If the instanceName already exists,
the new object will reference the existing category instance.

Types
There are many different types of performance counters. Each type is distinguished by how the
performance counter performs calculations. For example, there are counters that are used to
calculate average values over a period of time, and counters that measure the difference
between a current value and a previous value.
The following list describes the most commonly used counter types.

• NumberOfItems32. Maintains a simple count of items, operations, and so on. For example, you might use this counter type to track the number of orders received, stored as a 32-bit number.
• NumberOfItems64. Maintains a simple count with a higher capacity. For example, you might use this counter type to track orders for a site that experiences very high volume, stored as a 64-bit number.
• RateOfCountsPerSecond32. Tracks the amount per second of an item or operation. For example, you might use this counter type to track the orders received per second on a retail site, stored as a 32-bit number.
• RateOfCountsPerSecond64. Tracks the amount per second with a higher capacity. For example, you might use this counter type to track the orders per second for a site that experiences very high volume, stored as a 64-bit number.
• AverageTimer32. Calculates the average time to perform a process or to process an item. For example, you might use this counter type to calculate the average time an order takes to be processed, stored as a 32-bit number.

Some performance counter types rely on an accompanying base counter that is used in the
calculations. The following list pairs each base counter type with its corresponding
performance counter types:

• AverageBase: AverageTimer32, AverageCount64
• CounterMultiBase: CounterMultiTimer, CounterMultiTimerInverse, CounterMultiTimer100Ns, CounterMultiTimer100NsInverse
• RawBase: RawFraction
• SampleBase: SampleFraction

For a detailed description of all the Performance Counter types available, see "Appendix B.
Performance Counter Types."

Installing Performance Counters


Your application cannot increment built-in performance counters, so if you want to have your
application write to performance counters, you will need to create these counters yourself.
Administrative rights are required to install performance counters, so you should install
performance counters before run time. Typically, performance counters are installed when the
application responsible for using them is installed.
You must create counters in a user-defined category instead of in the categories defined by
Windows. That is, you cannot create a new counter within the Processor category or any other
system-defined categories. Additionally, you must create a counter in a new category; adding a
counter to an existing user-defined category will raise an exception.
To install performance counters, you should create a project installer class that inherits from
Installer, and set the RunInstallerAttribute for the class to true. Within your project, create a
PerformanceCounterInstaller instance for each performance counter category, and then add
the instance to your project installer class.
Now you can specify each of the individual custom counters. You should use the
CounterCreationData class to set attributes for each custom counter. This class has the
following properties:

• CounterName. This property is used to get or set the name of the custom counter.
• CounterHelp. This property is used to get or set the description of the custom counter.
• CounterType. This property is used to get or set the type of the custom counter.

If the performance counter relies on a base counter, the performance counter creation data
must be immediately followed by the base counter creation data in code. If it is not, the two
counters will not be linked properly.

If you do not specify a counter type when creating the counter, it defaults to
NumberOfItems32.

Now that the individual custom counters are created, they can be added to the
PerformanceCounterInstaller collection. The following code shows a performance counter
named ConfirmPickup in the category WSPickupService being added to the
PerformanceCounterInstaller collection.
C#
using System;
using System.ComponentModel;
using System.Configuration.Install;
using System.Diagnostics;

namespace PerformanceCounters.InstrumentationTechnology
{
    [RunInstaller(true)]
    public class PerformanceCountersClass : Installer
    {
        // Constructor.
        public PerformanceCountersClass()
        {
            // Installer for performance counters with category name: WSPickupService.
            PerformanceCounterInstaller WSPickupServicePerfCountInstaller =
                new PerformanceCounterInstaller();
            WSPickupServicePerfCountInstaller.CategoryName = "WSPickupService";

            // Counter creation data for the ConfirmPickup counter.
            CounterCreationData confirmPickupCounterCreation = new CounterCreationData();
            confirmPickupCounterCreation.CounterName = "ConfirmPickup";
            confirmPickupCounterCreation.CounterHelp = "Counter Help";
            confirmPickupCounterCreation.CounterType =
                PerformanceCounterType.NumberOfItemsHEX32;

            WSPickupServicePerfCountInstaller.Counters.Add(confirmPickupCounterCreation);
            Installers.Add(WSPickupServicePerfCountInstaller);
        }
    }
}

Typically, you should not call the methods of the PerformanceCounterInstaller class from within
your code; they are generally called only by the InstallUtil.exe installation utility. The utility
automatically calls the Install method during the installation process. It backs out failures, if
necessary, by calling the Rollback method for the object that generated the exception.

Writing Values to Performance Counters


You can write a value to a performance counter in a number of ways:
• You can increment a counter by one using the Increment method of the
PerformanceCounter class.
• You can change the counter's current raw value by a positive or negative amount using
the IncrementBy method of the PerformanceCounter class.
• You can set a performance counter to a particular value by using the RawValue
property of the PerformanceCounter class.

Incrementing by a negative number decrements the counter by the absolute value of the
number. For example, incrementing with a value of 3 will increase the counter's raw value by
three, and incrementing with a value of –3 will decrease the counter's raw value by three.

You can only increment values on custom counters; by default, your interactions with system
counters via a PerformanceCounter component instance are restricted to read-only mode.
Before you can increment a custom counter, you must set the ReadOnly property on the
component instance with which you are accessing it to false.

There are security restrictions that affect your ability to use performance counters. For more
information, see "Introduction to Monitoring Performance Thresholds" on MSDN at
http://msdn.microsoft.com/library/en-us/vbcon/html/vbconintroductiontomonitoringperformancethresholds.asp.
To write values to performance counters
1. Create a PerformanceCounter instance and configure it to interact with the desired
category and counter.
2. Write the value using one of the methods described in the following list.

• Increment. Increases the raw value by one. Takes no parameter.
• Decrement. Decreases the raw value by one. Takes no parameter.
• IncrementBy. Increases the raw value by more than one when passed a positive integer, or decreases it by more than one when passed a negative integer.
• RawValue. Resets the raw value to any positive or negative integer, instead of incrementing it.

The following code shows how to set values for the ConfirmPickup counter in two ways: the
first method increments the raw value by a specified amount, and the second increments it
by one.
C#
protected override void DoIncrementByPickupServiceConfirmPickup(int increment)
{
    using (PerformanceCounter counter
        = new PerformanceCounter("WSPickupService", "ConfirmPickup", false))
    {
        counter.IncrementBy(increment);
    }
}

protected override void DoIncrementPickupServiceConfirmPickup()
{
    using (PerformanceCounter counter
        = new PerformanceCounter("WSPickupService", "ConfirmPickup", false))
    {
        counter.Increment();
    }
}

Main and base counters must be updated independently.


Connecting to Existing Performance Counters
When you connect to an existing performance counter, you do so by specifying the computer on
which the counter exists, the category for the counter, and the name of the counter itself.
Additionally, you have the option of specifying the instance of the counter you want to use, if
the counter contains more than one instance. You can then read any and all data from the
counter. You can also enumerate the existing categories, counters, and instances on the
computer by using code, or you can use Server Explorer to see a list of existing counters on the
computer.

After you create custom performance counters, you may have to restart the Performance
Monitor tool (Perfmon.exe) that is installed with Windows before you can see the custom
counters in that application.

Performance Counter Value Retrieval


There are several ways you can read performance counter values:
• You can retrieve a raw value from a counter using the RawValue property on the
PerformanceCounter class.
• You can retrieve the current calculated value for a counter using the NextValue method
on the PerformanceCounter class.
• You can retrieve a set of samples using the NextSample method on the
PerformanceCounter class and compare their values using the
CounterSample.Calculate method.

There are security restrictions that affect your ability to use performance counters. For more
information, see "Introduction to Monitoring Performance Thresholds" on MSDN at
http://msdn.microsoft.com/library/en-us/vbcon/html/vbconintroductiontomonitoringperformancethresholds.asp.

Raw, Calculated, and Sampled Data


Performance counters record values about various parts of the system. These values are not
stored as entries; instead, they are persisted in memory only for as long as a handle remains
open for the particular category. The process of retrieving data from a performance counter is
referred to as sampling. When you sample, you retrieve either the immediate value of a counter
or a calculated value.
Depending on how a counter is defined, its value might be the most recent aspect of resource
utilization, also referred to as the instantaneous value, or it might be the average of the last two
measurements over the period of time between samples. For example, when you retrieve a
value from the Process category's Thread Count counter, you retrieve the number of threads for
a particular process as of the last time this was measured. This is an instantaneous value.
However, if you retrieve the Memory category's Pages/Sec counter, you retrieve a rate per
second based on the average number of memory pages retrieved during the last two samples.
Resource usage can vary dramatically, based on the work being done at various times of day.
Because of this, performance counters that show usage ratios over an interval are a more
informative measurement than averages of instantaneous counter values. Averages can include
data for service startup or other events that might cause the numbers to go far out of range for
a brief period, thereby skewing results.
The PerformanceCounter component provides facilities for the most common Windows
performance monitoring requirement, namely, connecting to an existing counter on the server
and reading and writing values to it. Additional functionality, such as complex data modeling, is
available directly through Windows Performance Monitor. For example, you can use
Performance Monitor to chart the data a counter contains, run reports on the data, set alerts,
and save data to a log.
The interaction between raw values, next (or calculated) values, and samples is fairly
straightforward after you understand that raw and calculated values shift constantly, whereas
samples allow you to retrieve a static snapshot of the counter at a particular point in time.
Figure 2 illustrates the relationship between raw value, next value, and samples.

Figure 2
Performance counter values: raw, calculated, and sampled
The diagram in Figure 2 shows a representation of the data contained in a counter named
Orders Per Second. The raw values for this counter are individual data points that vary by
second, where the calculated average is represented by the line showing an increasing order
receipt over time. In this chart, the following data points have been taken:

• The user has used the NextValue method to retrieve the calculated value at three
different times, represented by NV1, NV2, and NV3. Because the next value is
constantly changing, a different value is retrieved each time without specifying any
additional parameters.
• The user has used the NextSample method to take two samples, indicated by S1 and S2.
Samples freeze a value in time, so the user can then compare the two sample values
and perform calculations on them.
Comparing Retrieval Methods
Retrieving a raw value with the RawValue property is very quick, because no calculations or
comparisons are performed. For example, if you are using a counter simply to count the number
of orders processed in a system, you can retrieve the counter's raw value.
Retrieving a calculated value with the NextValue method is often more useful than retrieving
the raw value, but this value may also give you an unrealistic view of the data because it can
reflect unusual fluctuations in the data at the moment when the value is calculated. For
example, if you have a counter that calculates the orders processed per second, an unusually
high or low amount of orders processed at one particular moment will result in an average that
is not realistic over time. This may provide a distorted view of the actual performance of your
system.
Samples provide the most realistic views of the data in your system by allowing you to retrieve,
retain, and compare various values over time. You would retrieve a sample, using the
NextSample method, if you needed to compare values in different counters or calculate a value
based on raw data. This may be slightly more resource-intensive, however, than a NextValue
call.
The NextSample method returns an object of type CounterSample. When you retrieve a
sample, you have access to properties on the CounterSample class, such as RawValue,
BaseValue, TimeStamp, and SystemFrequency. These properties let you get a very detailed look
at the data that makes up the sample data.

Summary
This chapter demonstrated how to create performance counters in custom performance
categories and how to connect to existing performance counters so that performance counter
data can be retrieved. For more detailed information about specific performance counter types,
see "Appendix C: Performance Counter Types."
Chapter 13
Building Install Packages
In this preliminary version of the guide, this chapter is still under development. It is anticipated
that a future release of the guide will include detailed information about building install
packages for instrumented applications.
Section 4
Managing Operations
This section focuses on the tasks performed by the operations team when managing
applications. It demonstrates an application in use and describes event log events, performance
counters, Windows Management Instrumentation (WMI) events, and event trace entries for the
application. It examines some of the important concepts involved in creating Management
Packs for Microsoft Operations Manager (MOM) 2005 and System Center Operations Manager
2007, and it describes the tasks involved in creating those Management Packs, including
importing Management Packs from the Management Model Designer (MMD) tool and the
TSMMD.
This section should be of use primarily to the operations team and for Management Pack
developers.
Chapter 14, "Deploying and Operating Manageable Applications"
Chapter 15, "Monitoring Applications"
Chapter 16, "Creating and Using Microsoft Operations Manager 2005 Management Packs"
Chapter 17, "Creating and Using System Center Operations Manager 2007 Management Packs"
Chapter 14
Deploying and Operating Manageable
Applications
After you define your application in the TSMMD, generate instrumentation for the application,
and call the instrumentation code from your application, you are ready to deploy the application
in a test, and ultimately a production environment. It is at this point that the instrumentation
you have created can be used by the test and operations teams.
This chapter uses the Transport Order application and the Transport Order Web service, two
parts of the Northern Electronics example, to illustrate how application instrumentation is
deployed and used.

Deploying the Application Instrumentation


When you deploy an application instrumented using the TSMMD tool, you must install not only
the application itself but also the instrumentation the application uses. This involves building
the solutions and then running the installation utility, Installutil.exe, against each
instrumentation technology DLL. You may encounter the following technology DLLs:
• EventLogEventsInstaller.dll
• WindowsEventing6EventsInstaller.dll
• PerformanceCountersInstaller.dll
• WmiEventsInstaller.dll

For more information about application instrumentation and how it is installed, see Chapters
8–12 of this guide.

Running the Instrumented Application


In this scenario, the Transport Order application presents a page where the user can select an
order placed by a customer, edit the delivery details, and send the order to the transport
service. Figure 1 illustrates this page.
Figure 1
First page of the example Transport Order application

The dates shown in this and other screenshots in this chapter are shown in dd/mm/yyyy
format.

However, in operation, the Transport Order application depends on configuration values such as
the URL of the Transport Order Web service. If this configuration information is incorrect,
perhaps because the operations team has been reorganizing the servers or the Transport Order
Web service has experienced a failure, posting the order results in an error. The application
detects that the order submission failed and displays a message on the left side of the page, as
shown in Figure 2.
Figure 2
The result when the application cannot contact the Transport Order Web service

Event Log Instrumentation


Comprehensive instrumentation is included for this application, meaning that the operations
team, when informed of the problem, can open the Windows Event Log (either locally or
remotely), and see the event message shown in Figure 3.
Figure 3
Details of the application failure in the Windows Event Log
In this case, the error message probably does not provide sufficient information to help the
operations team. It simply contains the same text as the message displayed on the Web page.
The developer has correctly caught the exception and written it to the event log, but there is no
way of knowing why the remote server that implements the Transport Order Web service did
not respond, unless the operations team is aware of a particular cause of this error. To solve this
problem, more information about this error should be included in the TSMMD model for the
application, and the instrumentation should then be regenerated.
If the operations team is aware of the error or a specific application dependency, they can open
the Web.config file for this application to investigate further. In this case, they find that, as
shown in Figure 4, the value for the TOA.WebServiceProxies.Transport key contains the
incorrect port number, which they can easily correct.
Figure 4
The Web service configuration value in the Web.config file
After the operations team corrects the port number, the application behaves correctly. After
posting the order to the Transport Order Web service and receiving an indication that it
succeeded, the Web page clears the existing values from the controls and allows the user to
select another order.

Performance Counter Instrumentation


The instrumentation that the developers implemented includes performance counters that
provide a record of the execution of the Transport Order Web service. Operations staff can use
the Performance utility within the Windows operating system to examine the current values and
history for the average document processing time, and the total number of requests placed over
a period, as shown in Figure 5.

Figure 5
The performance counters exposed by the Transport Order Web service
WMI
The instrumentation for the application includes Windows Management Instrumentation (WMI)
events. Figure 6 shows the WMI event generated when the database is stopped.

Figure 6
WMI event raised from the Transport Order application

Trace File Instrumentation


The instrumentation for the application also includes trace file entries. Figure 7 shows the trace
file entry generated when the database is stopped.
Figure 7
Trace file entry raised from the Transport Order application

Summary
This chapter showed how to install the application instrumentation and demonstrated the use
of application instrumentation for the Transport Order application, which forms part of the
Northern Electronics example.
Chapter 15
Monitoring Applications
Defining a management model for an application and using the management model to ensure
that your application is well instrumented and can report health state is an important
requirement for designing manageable applications. However, a manageable application is not
all that is required to ensure that an application is easy to manage by the operations team. You
must also have a solution for monitoring the application, such as Microsoft Operations Manager
2005 or System Center Operations Manager 2007.
This chapter examines how to monitor applications; as an example, it uses Operations Manager
2005 monitoring the Transport Order Web service (part of the Northern Electronics Scenario).

Distributed Monitoring Applications


Most monitoring applications and environments, such as IBM Tivoli, CA Unicenter, Operations
Manager 2005 and Operations Manager 2007, use a central server to collect, store, and expose
information about remote applications and systems using agents installed on these remote
computers. In larger installations, there may be several services collecting information from
their own subsets of remote client computers, collating the data, and passing it back to the
central monitoring server. Figure 1 illustrates a simple Operations Manager installation.
Figure 1
The main components of a typical Microsoft Operations Manager environment
The agents installed on the remote client computers that collect the information can be
managed agents, which run on computers running Windows, or unmanaged agents, which run
on computers running other operating systems. These agents collect and send back information
about their own performance (so the central monitoring server can detect them and check that
they are executing correctly), the basic parameters of the host system (such as processor and
memory usage), and any other information specified by the rules within the Management Packs
installed on the central monitoring server.
In some cases, it is not possible to install an agent on remote computers. When this is the case,
you can take advantage of Operations Manager's support for agent-less monitoring. Using
remote procedure calls (RPC) and DCOM method invocation, the management server can
provide most of the same monitoring features as monitoring through a remote agent. This
requires the providers used within the application to support remote access, and the account
used by the Operations Manager server must have administrative permissions on the remote
computers.
Most monitoring systems, including MOM, also provide a connector framework that allows
other monitoring systems to interact with the central server(s) and custom clients to provide
information about remote computers.
To view the information collected by the remote client computers, and to monitor application
and system performance, monitoring systems, such as Operations Manager, provide remote
consoles that operations staff can use to administer the system and view the state of monitored
applications. As you administer the system by adding, editing, and removing rules, the central
server sends these as configuration changes to all the remote agents, which store this
configuration locally and use it to determine the events and other data to send to the central
server.
Most monitoring systems also provide a reporting feature that allows administrators and
operators to create historical reports from the data collected by the agents. These reports are
often useful in providing indications of performance degradation over time and detecting
impending issues.

Management Packs
Microsoft Operations Manager, like most other monitoring applications and environments,
relies on a series of Management Packs that define the rules, views, and alerts for a specific set
of monitoring processes. Each Management Pack contains a rule group, which contains the set
of rules applicable to the monitored application or system.
The usual approach is to create a Management Pack that matches the management model you
create for your application and install it along with the standard Management Packs that
monitor other features and systems. For example, the Management Pack generated by the
Management Model Designer (MMD) and used with the Northern Electronics application
contains a series of rules and alerts that map directly to the instrumentation within the
application. Separate Management Packs (provided with Operations Manager) monitor the
basic features of the remote computers as sent by the agents installed on the remote
computers.

This division of monitoring tasks into separate functional areas means that your Management
Pack should only include rules that directly relate to your application and that measure
features your application can influence. As an example, you should not include a rule to
monitor the amount of free memory in your application Management Pack, because this does
not directly relate to your application processes. Instead, you install and use the Management
Pack that contains the remote sever information and use this to monitor all the non-
application related features, such as processor loading and memory usage.

Rules and Rule Groups


Subsequent chapters include detailed procedures that describe the process of creating a
Management Pack. This brief overview introduces you to Management Packs and rules. This will
help you to understand the way you can map the instrumentation of an application to a set of
monitoring rules.
Figure 2 illustrates the Administrator Console in Microsoft Operations Manager 2005 with a
Management Pack for the Transport Order application installed. This is a very simple example of
a Management Pack that consists of only a rule group named TransportOrderApplication that
defines three event rules and two performance-processing rules.

Figure 2
The MOM 2005 Administrator Console showing the TransportOrderApplication rule group

You can use a rule group to associate a set of rules with an individual server or a group of
servers. You can enable and disable a complete group, and display and work with just a single
group, without associating each rule that you add or modify directly with one or more servers.
This makes management and monitoring much easier, especially as the architecture and
deployment of an application change over time.

Figure 3 illustrates some of the properties of the TransportWebServiceFailed event rule. This
rule takes as its source the application event log on the monitored server (where the main
Transport Order application runs) and uses criteria to select the event log entries that
correspond to the "Unable to connect to the remote server" error. When it detects this event, it
generates a critical error alert within the monitoring system, using the source and description of
the original event log entry for the new alert.
Figure 3
The event rule for the TransportWebServiceFailed event
Figure 4 illustrates the TransportServiceResponseTime performance rule. In this case, the
source of the values for the rule (the provider) is the AverageDocumentProcessingTime
performance counter implemented by the instrumentation within the Transport Order Web
service. In this example, the monitoring system interrogates the counter every minute and
stores the values so it can present a graph of performance over time.
Figure 4
The Performance rule for the Transport Order Web service response time
With these rules in place, the Operator Console will display the overall state of the application
(based, of course, on the defined rules) by rolling up the individual values of each alert raised by
the event and performance rules. You can specify how these rules roll up (how they combine
when there is more than one monitored entity). In this example with only a single monitored
instance, the overall state directly reflects the worst case—this is described in greater detail in
Chapters 17 and 18 of this guide.
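The worst-case roll-up described above can be sketched as follows. This is an illustrative model only, assuming a simple three-level severity ordering; it is not the actual Operations Manager implementation.

```python
# A minimal sketch of the worst-case roll-up described above. The state
# names and severity ordering are assumptions for illustration; this is
# not the actual Operations Manager implementation.

SEVERITY = {"Success": 0, "Warning": 1, "Critical": 2}

def roll_up(states):
    """Return the worst (highest-severity) state among the monitored entities."""
    if not states:
        return "Success"
    return max(states, key=lambda s: SEVERITY[s])

# A single Warning among otherwise healthy entities rolls up to Warning,
# and any Critical alert dominates the overall state.
print(roll_up(["Success", "Warning", "Success"]))
print(roll_up(["Warning", "Critical"]))
```

With only one monitored instance, as in the example scenario, the rolled-up state simply mirrors that instance's worst unresolved alert.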
Figure 5 illustrates the state view in the Operations Manager 2005 Operator Console. You can
use the Group list on the toolbar at the top of the window to select the displayed scope; in this
case, it is set to show only the TransportOrderApplication rule group. This rule group is associated
with only a single server named DELMONTE that (in this simple scenario demonstration)
implements both the main Web application and the Transport Order Web service. You can see
that there are no open (in other words, unresolved) alerts for the entire application running on
this server.
Figure 5
Monitoring the overall state of the Transport Order application running on a single remote server

Monitoring the Example Application


The Transport Order application contains extensive instrumentation for many different kinds of
exceptions that might occur, including general errors, such as the user entering invalid values for
the order parameters. For example, as illustrated in Figure 6, omitting the Expected Weight
value when submitting the order raises an error indicated by the message on the left side of the
page.
Figure 6
The result of a missing order parameter in the example Transport Order application
This is not strictly a failure of the application, but neither is it an event that should occur in normal operation. The
instrumentation in the application writes an entry in Windows Event Log indicating the error. (A
similar error occurs if the value is not numeric or if the user enters incompatible values for other
parameters, such as From and To dates that do not define a valid period.)
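The validation checks just described might look something like the following sketch. The function name and error messages are hypothetical; only the field names (Expected Weight, From and To dates) come from the application description.

```python
# Illustrative sketch (not the actual application code) of the kind of
# parameter validation the Transport Order application performs before its
# instrumentation records a user error in the event log. The function name
# and error messages are hypothetical; the field names come from the text.
from datetime import date

def validate_order(expected_weight, from_date, to_date):
    """Return a list of validation errors; an empty list means the order is valid."""
    errors = []
    if expected_weight is None or str(expected_weight).strip() == "":
        errors.append("Expected Weight value is missing")
    else:
        try:
            float(expected_weight)
        except ValueError:
            errors.append("Expected Weight value is not numeric")
    if from_date >= to_date:
        errors.append("From and To dates do not define a valid period")
    return errors

# Omitting the Expected Weight value produces a user error that would be
# written to the event log rather than treated as an application failure.
print(validate_order(None, date(2008, 8, 1), date(2008, 8, 5)))
```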
When this event occurs, and the application adds an entry to the event log, the local Operations
Manager agent running on this computer passes details of the event log entry to the Operations
Manager central server. This causes the state for the application to change to that specified in
the rule that applies to this event—in this example, it changes to a Warning state. As you can
see in Figure 7, the Operator Console reflects this change in the application state, and displays
the rolled-up state as a Warning.
Figure 7
The State view of the monitored application, showing a Warning state
The operations team is now immediately aware that the application state has changed
(Operations Manager can send an e-mail message, a system alert, or a pager message when an
alert occurs), and they can investigate. Switching to Alerts view (or double-clicking the entry for
the server) displays details of all the current alerts. In this case, the issue is not critical and
simply indicates that the user did not enter valid values. However, as you can see in Figure 8, the
Properties view for the alert contains a great deal of useful information about the event that
caused the alert.
Figure 8
The Properties view for an alert, showing the values from the event log and other useful diagnosis
information
One of the core tenets of management modeling is that your management model should
incorporate knowledge that helps operations staff to diagnose a problem, resolve it, and verify
the resolution. This knowledge contains both application-specific content (usually provided by
the architect and developer), and company-specific knowledge (some of which is generated
during and after installation of the application).
Each rule you create in Operations Manager can store both product knowledge and company
knowledge, and the Operator Console presents this knowledge as you view each alert. For
example, Figure 9 illustrates the product knowledge incorporated into the rule, including a
summary, likely causes (the diagnostic information), and resolution information.
Figure 9
The Product Knowledge view for an alert, showing diagnosis and resolution details
Figure 10 illustrates the Company Knowledge view, which contains an Edit button that allows
operators with the appropriate access permission to edit the knowledge. There is also a view
that displays alert history that allows operators to quickly see how and when this alert occurs.
Figure 10
The Company Knowledge view for an alert, which you can edit to provide up-to-date information

One useful feature of monitoring and recording user errors (as opposed to application faults) is
that you gain insight into the usability of the application and the kinds of problems that
users face when using it. In this particular example, you may decide that a mechanism that
prevents users from submitting orders with no Expected Weight value (such as client-side
validation) would reduce the load on the servers and make the application easier to use.
This is useful feedback for the architect and developer; it is collected automatically and reflects
actual usage instead of user perception and opinion.

Monitoring the Remote Web Service


Collecting events that correspond to user errors is useful, but the fundamentally more
important aspect of monitoring is to be able to detect failures, application performance issues,
and other problems that directly affect business processes.
The example Management Pack for the Transport Order application contains two rules that
detect failure of the Transport Order Web service:
• TransportOrderServiceFailed. This rule maps event log entries created by the main
Web application when it fails to connect to the remote Web service to an alert in MOM.
• TransportWebServiceFailed. This rule maps event log entries created by the Transport
Order Web service when it encounters an error to an alert in MOM.
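Conceptually, each of these rules maps a matching event log entry to an alert. A minimal sketch of that mapping might look like the following; the rule names come from the text, but the event sources, matching criteria, and data structures are assumptions for illustration (only the "Unable to connect to the remote server" description appears in the text).

```python
# Hypothetical illustration of how an event rule maps a matching event log
# entry to an alert. The rule names come from the text; the event sources,
# matching criteria, and data structures are assumptions for illustration,
# not the Operations Manager internals.

RULES = [
    {"name": "TransportOrderServiceFailed",
     "source": "TransportOrderApplication",        # assumed event source
     "contains": "Unable to connect to the remote server",
     "severity": "Critical Error"},
    {"name": "TransportWebServiceFailed",
     "source": "TransportOrderWebService",         # assumed event source
     "contains": "error",
     "severity": "Critical Error"},
]

def map_event_to_alert(event):
    """Return an alert for the first rule whose criteria match the event."""
    for rule in RULES:
        if event["source"] == rule["source"] and rule["contains"] in event["description"]:
            # The alert reuses the source and description of the event log entry.
            return {"rule": rule["name"],
                    "severity": rule["severity"],
                    "source": event["source"],
                    "description": event["description"]}
    return None
```

A connection-failure event from the main Web application would match the first rule and surface as a critical error alert in the Operator Console.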

As soon as the Transport Order Web service fails, the Operations Manager agent on the Web
server sends details of the event log entry to the central monitoring server, which automatically
changes the state to the worst of all currently unresolved alerts. The
TransportOrderServiceFailed rule maps event log entries to a critical error alert, so this is the
state displayed in the Operator Console (as shown in Figure 11).

Figure 11
The critical error state caused by failure to connect to the Transport Order Web service
Viewing details of the alert provides very little useful information—mainly the information that
is visible in Windows Event Log, plus a count of the number of times that Operations Manager
detected this event, the period within which those occurrences fell, and the mapped rule name (see
Figure 12).
Figure 12
Alert details and summary for the alert raised by the TransportOrderServiceFailed rule
However, the TransportOrderServiceFailed rule specifies both product knowledge and company
knowledge that is directly useful for diagnosing and resolving the problem indicated by this
alert. For example, as you can see in Figure 13, the CAUSES and RESOLUTIONS sections identify
the configuration error that causes this failure to connect to the remote service and provides
the correct value (or points to where the operator could obtain the current value).
Figure 13
The product knowledge provided by the TransportOrderServiceFailed rule
Figure 14 illustrates the company knowledge for this rule. Operators could use this editable area
to store the correct current value for the Transport Order Web service or notes that indicate
how to deduce or discover its location if it changes on a regular basis.
Figure 14
The company knowledge provided by the TransportOrderServiceFailed rule
After correcting the configuration value, the operations staff can reattempt the
request to ensure it completes successfully. In fact, as in the example application, the target
Web service may expose a simple method that does no processing but simply indicates a
successful connection to the service. In this case, the knowledge will include details of how to
execute this method to verify resolution of the connection problem.
In the example application, having resolved the connection problem, the operations staff might
now discover that the Transport Order Web service itself is failing. However, this is not directly
obvious because the only indication is that the controls on the Web page remain populated with
the original values, even after posting the order to the Transport Order Web service, as shown in
Figure 15. In normal circumstances, as demonstrated earlier, code in the application clears the
controls and allows the user to select another order.
Figure 15
Failure of the Transport Order Web service is not directly obvious in the Transport Order
application
However, the monitoring system shows the real situation, because the
TransportWebServiceFailed rule maps event log entries created by the Transport Order Web
service when it encounters an error to an alert in Operations Manager. Figure 16 illustrates the
new critical error alert at the top of the list and the event log message in the lower-right part of
the window. In this case, there is much more information in the error message, including the
useful fact that the code detected the DataBaseName key missing in the application
configuration file.
Figure 16
The alert created by the failure of the Transport Order Web service
The product knowledge stored within the rule indicates in more detail why this error occurred
and how to resolve it. The RESOLUTIONS section indicates that the DataBaseName key should
have the value "Transport", as shown in Figure 17.
Figure 17
The product knowledge provided by the TransportWebServiceFailed rule
Looking at the Web.config file for the Transport Order Web service, it becomes obvious why this
error occurred—someone has commented out the DataBaseName key, as shown in Figure 18.
Removing the enclosing comment markers "<!--" and "-->" and running the application again
results in successful execution of the Transport Order Web service.

Figure 18
The error arises because the DataBaseName key is commented out
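A sketch of the broken and corrected configuration follows. The DataBaseName key and its "Transport" value come from the text; the surrounding appSettings section is an assumption based on standard ASP.NET configuration conventions.

```xml
<!-- Broken: the add element is enclosed in comment markers, so the
     DataBaseName key is invisible to the Web service (Figure 18). -->
<appSettings>
  <!-- <add key="DataBaseName" value="Transport" /> -->
</appSettings>

<!-- Fixed: removing the enclosing comment markers restores the key. -->
<appSettings>
  <add key="DataBaseName" value="Transport" />
</appSettings>
```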
From this simple example, you can see just how the combination of a suitable health model with
the appropriate instrumentation and application monitoring makes it much easier to detect,
diagnose, and resolve problems in complex and distributed applications. It solves the following
three issues encountered at the start of this chapter:
• The operations team no longer needs to rely on users to detect and report faults.
Sufficient and accurate information in the form of knowledge stored within the health
model and the monitoring rules make diagnosis and resolution of faults easier, less
costly, and less time-consuming.
• The operations team does not have to visit the computer to investigate, nor do they
have to depend on scant information they may extract from the event logs or
performance counters. The health model knowledge provides the detailed data
required to resolve the fault.
• The operations team can easily detect problems early, such as impending failure of a
connection to a remote service caused by a failing network connection or lack of disk
space on the server, without having to continuously monitor performance counters and
event logs or use them as the sole sources of information for diagnosing faults.

Summary
Defining a management model for an application is very important in ensuring that it can be
managed by the operations team. However, without an effective way of monitoring the
instrumentation that is generated by your application, the application may still prove difficult to
manage. This chapter explained the benefits of monitoring applications and explained some of
the most important components of monitoring software, using Operations Manager 2005 as an
example. The following chapters will examine the use of Operations Manager 2005 and
Operations Manager 2007 Management Packs in more detail.
Chapter 16
Creating and Using Microsoft
Operations Manager 2005
Management Packs
As discussed in Chapter 15, the basis for monitoring applications in Microsoft Operations
Manager 2005 is the Management Pack, which describes the rules, views, and alerts for a specific set
of monitoring processes. This chapter describes a number of scenarios for creating and using
Management Packs. It includes detailed information about the following:

• Importing a Management Model from the Management Model Designer


• Creating a Management Pack in the Operations Manager 2005 Administrator Console
• Editing an Operations Manager 2005 Management Pack
• Viewing Management Information in Operations Manager 2005
• Creating Management Reports in Operations Manager 2005

The Transport Order application is used as a running example throughout this chapter. This
application forms part of the Shipping solution in the Northern Electronics worked example
used throughout this guide.

Importing a Management Model from the MMD into Operations Manager 2005


Operations Manager 2005 provides many manageability benefits to the operations team, but it
does not allow you to define directly the type of management model for your application
previously described in this guide. In fact, Operations Manager 2005 does not allow you to
specify health state information such as the RED, YELLOW, and GREEN health aspects commonly
used in a management model. As a result, in many cases, you are likely to design your
application management model using the Team System Management Model Designer Power
Tool (TSMMD) or the Management Model Designer (MMD) tool.
As shown in Chapter 7 of this guide, an application management model defined in the TSMMD
tool can be exported to a management pack for Operations Manager 2005, and edited within
Operations Manager as required. Finally, you can specify the associations between the rules and
the computers that will run the application, and deploy the rules to the Operations Manager
Agents on those computers.
To import a Management model from the Management Model Designer into Operations
Manager 2005
1. In the tree-view pane of the Administrator Console, expand the tree until
Management Packs is visible. Right-click Management Packs, and then click
Import/Export Management Pack.
If the tree-view pane is not visible, click Customize on the View menu. In the
Customize View dialog box, select the Console tree check box, and then click OK.
2. On the first page of the wizard, click Next.
3. On the Import or Export Management Packs page, select the Import Management
Packs and/or Reports radio button, and then click Next.
4. On the Select Folder and Choose Import Type page, select the folder that contains
the Management Pack (the .akm file) you exported from the Management Model
Designer. Under Type of import, select the Import Management Packs only radio
button, and then click Next.
5. The Select Management Packs page displays a list of the Management Packs located
in the folder you specified on the previous page. Select the Management Pack you
created with the Management Model Designer. If you want to import more than one
Management Pack, press and hold the SHIFT key or CTRL key while clicking each
Management Pack you want to import.
6. On the same page of the wizard, under Import Options, select the way you want
Operations Manager to update any existing Management Packs with the same
name:
◦ If you want to update an existing Management Pack, select the first option.
Operations Manager will retain any custom settings and knowledge from
existing rules in this Management Pack and import only changes to the
Management Pack.
◦ If you want to replace the existing Management Pack, select the second option.
If you are importing a new Management Pack that does not already exist in
Operations Manager, you can use the default "update" setting because there
will be no existing rules to update, so Operations Manager will create a
completely new set.
7. If you want Operations Manager to create backups of the Management Packs it
updates, select the Backup existing Management Pack check box. In the
Management Pack backup directory text box, type the name of the folder you want
the backup file to be placed in or use the Browse button to select the folder, and
then click Next.
8. On the Confirmation page of the wizard, click Finish. The Import Status dialog box
shows the status of the import process. When the import process completes, it
provides details of each stage of the process and indicates success or failure. It
shows the backup and import operations, detailed information about the operation,
the name of the imported file, the status, and a description. To create a file that
includes all the import steps, click the Create Log File button, as shown in Figure 1.
Figure 1
The status report displayed after importing a Management Pack

Viewing the Management Pack


You can now view the new Management Pack in the Operations Manager Administrator
Console. The Management Model Designer (MMD) generates a Management Pack that contains
multiple nested rule groups corresponding to the individual components and levels of the
original management model defined in the MMD, as shown in Figure 2. Within each level,
Operations Manager generates the event rules, alert rules, and performance rules defined in the
management model. It also automatically generates an alert rule at the top level that creates a
notification response to the network administrators when any rule with a severity of Error or
higher causes the state of the application to change.
Figure 2
The Management Pack imported into the Operations Manager 2005 Administrator Console

For more details about rule groups and rules, see "Creating and Configuring a Management
Pack in the Operations Manager 2005 Administrator Console".

As an example of the way that the MMD translates a management model into a Management
Pack, Figure 3 illustrates the General page of the Properties dialog box for the
TransportOrderUIErrors event rule. This rule uses Warning entries in Windows Application
Event Log to detect input errors by the user. In the management model, this causes a state
change to YELLOW for the user interface application, and the MMD appends this state to the
event name.
Figure 3
The General page of the Properties dialog box for a newly imported event rule
The MMD uses the values you enter when specifying the detector in the management model (in
this case, the TransportOrderUIErrors event) to generate the appropriate criteria for the event
rule. As shown in Figure 4, the MMD sets the Source of the event, and generates a regular
expression for the event ID to match that specified in the management model.
Figure 4
The Criteria page of the Properties dialog box for a newly imported event rule
The Alert page of the Properties dialog box shows that the MMD set the severity to Warning
(equivalent to the YELLOW health state), and specified that Operations Manager should create
an alert when this event occurs. It uses the values of the Source and Description from the event
to populate these fields of the alert, as shown in Figure 5.
Figure 5
The Alert page of the Properties dialog box for a newly imported event rule
The MMD uses the knowledge included in the management model to generate information for
the rules it generates. Figure 6 shows the Knowledge Base page for the new
TransportOrderUIErrors event rule, which contains sections labeled Summary, Diagnose,
Resolve, and Verify. These correspond directly to the steps defined in the management model
for monitoring and maintaining the application.
Figure 6
The Knowledge Base page of the Properties dialog box for a newly imported event rule
Figure 7 shows the Threshold page of the Properties dialog box for a newly imported
performance rule. The MMD sets the Threshold value and Match when the threshold meets
the following condition options for the rule based on the aspects you specified when creating
the management model. In this case, the rule will generate an alert and indicate a state change
when the value of the performance counter that measures the Transport Order Web service
response time exceeds 4999 (milliseconds).
Figure 7
The Threshold page of the Properties dialog box for a newly imported performance rule
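The threshold comparison that this rule performs can be illustrated as follows. The function and the sample values are hypothetical; the 4999-millisecond threshold and the counter name come from the management model described in the text.

```python
# Illustrative check corresponding to the imported performance rule: alert
# when a sampled counter value exceeds the threshold. The function and the
# sample values are hypothetical; the 4999 ms threshold and the counter name
# come from the management model described in the text.

THRESHOLD_MS = 4999

def check_response_time(sample_ms):
    """Return an alert message if the sample breaches the threshold, else None."""
    if sample_ms > THRESHOLD_MS:
        return (f"AverageDocumentProcessingTime exceeded {THRESHOLD_MS} ms "
                f"(measured {sample_ms} ms)")
    return None

print(check_response_time(3200))   # within the threshold: no alert
print(check_response_time(6500))   # breaches the threshold: alert message
```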
After importing a Management Pack from the Management Model Designer, you may need to
edit it, add new rules, or change the behavior of some sections. The remaining procedures in
this chapter show these processes in detail.
You also need to generate computer groups in the Administrator Console that correspond to the
sets of computers that will run the application and deploy the rules to these computers. For
more information, see Create an Operations Manager 2005 Computer Group and Deploy the
Operations Manager Agent and Rules.

Guidelines for Importing a Management Model from the Management Model Designer


When importing a management model from the MMD, you should consider the following
proven practices:
• Ensure that you provide all the required information, including the knowledge that
describes Diagnosis, Resolution, and Verification procedures, when you create your
management model in the Management Model Designer (MMD).
• Use the validation features of the MMD to make sure that there are no errors or
missing information before generating the Management Pack.
• Use the import options in Operations Manager 2005 to generate a backup of the
existing Management Pack if you are updating it or replacing it.
• After importing the Management Pack into Operations Manager 2005, make sure that
there are no conflicts and edit the rules as required.
• Create the required computer groups and associate and deploy the rule groups in the
new Management Pack to the appropriate computers.

Creating and Configuring a Management Pack in the Operations Manager 2005 Administrator Console


If you have not used the MMD tool (or another tool that can export *.akm files) as part of
defining the management model for your application, you will need to use the Administrator
Console in Operations Manager 2005 to create a Management Pack for your application. This
can be complex because, as mentioned earlier, there are no direct mappings between many
of the concepts contained in a management model and an Operations Manager 2005
Management Pack. However, by using rules to detect events and performance counters, you
can update state variables that correspond to the health state of an application. The rules may
also send a message to operators through e-mail or pager. The monitoring system also allows
you to create alert rules that combine different events and performance counters to tailor the
alert to match exactly the requirements of both the application and the operations staff.
By assigning individual rules to groups, you can associate a group with a specific section or
component of the application, which may correspond to a managed entity. This makes it easier
to update the monitoring configuration when the physical layout of the monitored application
and its components changes over time. You can also assign knowledge from the management
model that is common to a set of rules to the rule group; this reduces duplication of effort and
makes knowledge updates easier.
This section contains the following procedures:
• To create a new Management Pack and rule group
• To create an event rule for a rule group
• To create an alert rule for a rule group
• To create a performance rule for a rule group

This section contains only enough information to create a Management Pack with rule groups
and rules in place. In many cases, you will need to perform additional editing to the
Management Pack. For more information about these other tasks, see "Editing an Operations
Manager 2005 Management Pack" later in this chapter.

To create a new Management Pack and rule group


1. In the tree view pane of the Administrator Console, expand the list until Rule Groups
is visible under Management Packs, and then right-click Rule Groups. If the shortcut
menu contains Enable Authoring mode, click it, and then click Yes in the
confirmation dialog box. If the shortcut menu contains Disable Authoring mode, you
are already in authoring mode.
If the tree-view pane is not visible, click Customize on the View menu. In the
Customize View dialog box, select the Console tree check box, and then click OK.
2. If this is the first rule group for your application, you must create a top-level (parent)
group. To create a top-level rule group, right-click Rule Groups, and then click Create
Rule Group.
3. If you have already created a top-level rule group for your application, you can
create nested (child) rule groups within that top-level group. To create a child rule
group, right-click the top-level rule group entry, and then click Create Rule Group.
4. On the General page of the Rule Group Properties Wizard, type a name for the rule
group and, optionally, a description. Make sure the Enabled check box is selected,
and then click Next.
5. On the Knowledge Base page, click Edit, and then enter any company-specific
knowledge for this rule group. For example, you can record the name and location of
the application, the application purpose and owner, or any other information that is
common to all the rules you will create for this group, and which may be useful to
operators and administrators. Click OK, and then click Next.
6. On the Advanced page, you can reconcile Management Packs imported from
Operations Manager 2000 with the changes to the way Operations Manager 2005
handles rule groups when you come to export the rule group. When you create a
new rule group, you only need to select the way Operations Manager will export the
rules in this group. By selecting the first or second option in the Rule Group
ownership options section, you ensure that exported Management Packs contain all
the rules, not just the rules you have modified since you installed or imported the
rule group. And, because you will usually not want to preserve deleted rules in the
exported Management Pack, select the default option, "Export as a vendor produced
rule. If the rule is disabled, then do not export." Also, make sure the Mark this rule
group as deleted check box is clear. Click Next.
A rule group may contain child (nested) rule groups, which can make it easier to
administer monitoring for large or very complex applications by providing a facility to
set the parameters of rules in multiple child groups in one operation. Operations
Manager 2000 allows a child rule group to link to multiple parent rule groups. Later
versions of Operations Manager do not support this, so you must disable the links
between parent and child groups that violate this condition in imported Management
Packs using the list at the top of this page.
7. On the Knowledge Authoring page, you can provide content for the product
knowledge of a rule group. Under Sections, click Purpose, and then enter
information about the purpose of the application and the rule group. Repeat the
process by clicking Features and Configuration and entering the appropriate
information.
8. Click Finish to create the new rule group. You will be prompted to deploy the rules in
the new rule group to a group of computers. Click No because there are no rules in
the new group.
9. The new rule group appears in the left-side tree view. Select it and expand the nodes
below it to see the three rule categories that Operations Manager automatically
adds to each rule group: Event Rules, Alert Rules, and Performance Rules. These are
all empty. The right pane of the Administrator Console shows a summary of the
properties and Company Knowledge for the Rule Group (see Figure 8).

Figure 8
A new rule group in Microsoft Operations Manager 2005

To create an event rule for a rule group


1. In the left tree view of the Administrator Console, expand the list until Rule Groups
is visible (it is under the Management Packs entry). Expand the rule group to which
you want to add the new event rule.
2. In the tree view, right-click Event Rules, and then click Create Event Rule to open
the Select Event Rule Type dialog box.
3. In the Select Event Rule Type dialog box, select the type of rule you want to create
from the list. The rule types allow you to do the following:
◦ Alert on or Respond to Event. The rule will generate a single alert, or perform a
process you define one time, for each occurrence of the specified event.
Operations Manager will not process any more rules that may match this event
occurrence.
◦ Filter Event. The rule will generate a single alert, or perform a process you
define one time, for each occurrence of the specified event. Operations
Manager will then continue to process other rules that match this event
occurrence.
◦ Detect Missing Event. The rule will generate a single alert, or perform a process
you define one time, if an event you specify does not occur during a specified
period on specified days. You can use this rule to detect failures where the
component or application generates a "heartbeat" event on a regular basis.
◦ Consolidate Similar Events. The rule will generate a single alert, or perform a
process you define one time, for a specified combination of events that occur
within a duration you specify. You consolidate the events by specifying the event
fields that will have identical values.
◦ Collect Specific Events. The rule will generate a single alert, or perform a
process you define one time, for a specified combination of events that occur
within a duration that you specify. You can choose whether to store some or all
of the event parameters or to just discard them.
4. After selecting the required rule type, click Next to open the Data Provider page of
the Event Rule Properties Wizard. Here, you select the source of the event in the
Provider name drop-down list box. For example, the Operations Manager agent may
collect events from a specified Windows Event Log (such as Application, System,
DNS Server), detect an SNMP event through Windows Management
Instrumentation (WMI), detect an internally generated or a script-generated event
(a generic event), or use a timed event that fires on a fixed schedule you choose.
5. If the event source or data source you require is not in the drop-down list box, you
can specify your own event provider by clicking New and then entering the required
data. For example, you can specify an IIS Web server or FTP server log file, a custom
log file, a custom timed event period, or a custom WMI provider. After you specify
the required information in the Provider name and Provider type boxes, click Next
to open the Criteria page.
6. On the Criteria page, specify the criteria that will match the event you want to
handle. Select the check boxes that correspond to the criteria you want to specify,
such as the text values of the Source, ID, Type, and Description of the event. If you
are handling events from Windows Event Log, you can usually obtain these values by
examining the event in the Event Log viewer (see Figure 9).
Figure 9
The Source, ID, Type, and Description values for a Windows Event Log entry
7. As you enter criteria, the Criteria description section shows a summary of these
criteria. If you need to apply more specific criteria, click the Advanced button to
open the Advanced Criteria dialog box, where you can select any of the fields for an
event (including parameters for a WMI or custom event, log file names, and more),
and specify criteria to select only the events you want. You can also use the
Advanced Criteria dialog box to match any values in the event using different
conditions, such as partial string matching, regular expressions, and numerical value
order comparisons. After you construct the required criteria combination, click Close
in the Advanced Criteria dialog box. On the Criteria page, click Next.
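The kinds of matching the Advanced Criteria dialog box supports (partial string matching, regular expressions, and numerical comparisons) can be illustrated with a short sketch. This is not Operations Manager code; the event fields mirror the dialog box, but the event values and the criteria themselves are hypothetical:

```python
import re

# A hypothetical event, represented as a plain dictionary.
event = {"Source": "NorthernShipping", "ID": 1002,
         "Type": "Error", "Description": "Order queue length exceeded limit"}

# Each criterion pairs an event field with a condition, mirroring the
# match types the Advanced Criteria dialog box offers.
criteria = [
    ("Source", lambda v: v == "NorthernShipping"),            # exact match
    ("Description", lambda v: "queue" in v),                  # partial string match
    ("Description", lambda v: re.search(r"exceeded \w+", v)), # regular expression
    ("ID", lambda v: 1000 <= v < 2000),                       # numeric comparison
]

# The rule matches only when every criterion is satisfied.
matches = all(cond(event[field]) for field, cond in criteria)
print(matches)  # True: this event satisfies all four criteria
```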
8. If you are creating a Collect Specific Events rule, the next page is the Parameter
Storage page. Here, you can specify whether Operations Manager should collect the
parameters of each event. The default is to store no parameters, but you can use the
option buttons on this page to specify that you want to store all the parameters
from all matching events or just specific named parameters. After you select the
required option, click Next to show the Schedule page.
9. By default, Operations Manager will always process events, but you can change this
behavior on the Schedule page by specifying time spans when Operations Manager
will or will not process events. In the drop-down list box, click either Only process
data during the specified time or Process data except during the specified time,
and then specify the start time and end time. Use the check boxes below the drop-
down list box to select the days of the week to which this setting applies, and then
click Next.
If you are creating a Detect Missing Event rule, you use the Schedule page to specify
the period during each day when you expect the event to occur. You must specify a
schedule for this type of event rule.
10. The page you see next depends on the type of rule you are creating and the source
of the event for this rule. If you specified as the source of this rule a mechanism that
creates an event, such as Windows Event Log or a WMI event, and you are creating
an Alert on or Respond to Event rule or a Detect Missing Event rule, the next page is
the Alert page. On this page, you specify whether the event(s) detected by this rule
will generate an alert (which will appear in the Operator Console) and details about
the alert. The Alert page allows you to do the following:
◦ Specify whether the event will generate an alert (for which you will create an
alert rule) by selecting the Generate alert check box.
◦ Turn on display of health state for this rule and alert severity condition checking
by selecting the Enable state alert properties check box.
◦ Specify the alert severity (such as Critical Error, Warning, or Success) if you want
to always generate the same severity alert for this event. Alternatively, click the
Edit button to open the Alert Severity Calculation for State Rule dialog box,
where you specify a series of If conditions and an Else condition so the severity
(and therefore the health state displayed in the console) depends on the
parameter values for the event. This allows you to define, for example, that a
particular event will generate a Service Unavailable condition for specific values
of the parameters, and a Success condition for other values. You must specify at
least one condition that causes a RED state (Critical Error, Service Unavailable),
and one that causes a GREEN state (Success). You can also specify conditions
that cause a YELLOW state (Warning).
◦ Specify the name of the person responsible for tracking and resolving the alert
as the Owner. This allows Operations Manager to direct the alert to the
appropriate administrators and operators listed in the Notification Groups
section of the Administrator Console.
◦ Specify the Resolution state for the alert. By default, this is New, but you can set
it to Assigned to a group of people (such as a helpdesk or vendors), or mark it as
requiring scheduled maintenance. You can use the Global Settings section of the
Administrator Console to modify or define new resolution states. For more
information, see the later section "Viewing and Editing Global Settings."
◦ Specify the value for the Alert source. This is the text displayed as the Source in
the Operator Console when this alert occurs. You can enter custom text or select
from any of the fields in the event that causes this alert. The default is to use the
Source field value.
◦ Specify the value for the Description. This is the text displayed as the
Description in the Operator Console when this alert occurs. You can enter
custom text or select from any of the fields in the event that causes this alert.
The default is to use the Description field value.
◦ Specify details of the role of the server in the alert process using the Server role,
Instance, Component, and Custom Fields options.
Not all of the controls on the Alert page are available for every type of event
rule. Depending on the type of rule and the provider source, some of the
controls may be disabled.
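The If/Else severity calculation described above behaves like a first-match rule chain: conditions are tested in order, and the Else condition supplies the default. The following sketch uses a hypothetical queue_length parameter and illustrative thresholds; it is not Operations Manager code:

```python
# Severity labels as they appear in the console (illustrative subset).
CRITICAL_ERROR, WARNING, SUCCESS = "Critical Error", "Warning", "Success"

def calculate_severity(params):
    """First matching If condition wins; the final return is the
    Else condition, as in the Alert Severity Calculation dialog box."""
    # Hypothetical conditions on a hypothetical 'queue_length' parameter.
    if params.get("queue_length", 0) > 500:
        return CRITICAL_ERROR   # RED state (e.g., Service Unavailable)
    if params.get("queue_length", 0) > 200:
        return WARNING          # YELLOW state
    return SUCCESS              # GREEN state (the Else condition)

print(calculate_severity({"queue_length": 600}))  # Critical Error
print(calculate_severity({"queue_length": 50}))   # Success
```

Defining at least one RED condition and one GREEN condition, as the wizard requires, guarantees that the rolled-up health state can both degrade and recover.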
11. If you are creating a Consolidate Similar Events rule, the next page is the Consolidate
page. Use the check boxes in the list of event fields to specify those that must have
identical values in order for Operations Manager to consolidate multiple events into
a single alert. You can also specify the period within which the multiple events must
occur as a number of seconds. Operations Manager will only raise one alert in the
Operator Console for any number of consolidated events within this period.
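The consolidation behavior can be sketched as follows. This is an illustrative model under assumed semantics (one alert per group of events whose key fields match, measured from the first event in the window), not MOM's actual implementation; the event values are hypothetical:

```python
def consolidate(events, key_fields, window_seconds):
    """Return one alert per group of events that share identical values in
    key_fields and occur within window_seconds of the group's first event."""
    alerts = []
    open_windows = {}  # key -> start time of the current consolidation window
    for ev in sorted(events, key=lambda e: e["time"]):
        key = tuple(ev[f] for f in key_fields)
        start = open_windows.get(key)
        if start is None or ev["time"] - start > window_seconds:
            open_windows[key] = ev["time"]  # new window: raise one alert
            alerts.append(ev)
        # otherwise the event is folded into the already-raised alert
    return alerts

# Three identical events within 10 seconds, one more outside the window:
events = [{"Source": "App", "ID": 7, "time": t} for t in (0, 5, 10, 120)]
print(len(consolidate(events, ["Source", "ID"], 60)))  # 2 alerts
```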
12. If you are creating a Filter Event rule, the next page is the Filter page. Select the
option required for the way you want Operations Manager to evaluate other rules
that match the source event. You can specify if it should add matching events to the
database or ignore them as it continues evaluating rules.
13. If you specified as the source of this rule a timed event (a regular scheduled
occurrence) or if you are creating an Alert on or Respond to Event or a Detect
Missing Event rule, the next page is the Alert Suppression page. Use the check boxes
in the list of alert fields to specify the repeated alerts that must have identical values
in order for Operations Manager to ignore (suppress) them.
14. Click Next. If you are not creating a Consolidate Similar Events or Collect Specific
Events rule, the next page is the Responses page. Here, you specify the actions
Operations Manager should perform when a matching event occurs. If you do not
specify any response, Operations Manager simply generates an alert (provided you
have specified this on the Alerts page), and changes the state displayed in the
Operator Console. You can specify the following types of response:
◦ Launch a Script. This opens a dialog box where you select an existing Operations
Manager script or create a new script. You also specify whether the script should
run on the remote computer (where the Operations Manager agent resides) or
on the Operations Manager management server, the script timeout, and any
parameters required by the script.
◦ Send an SNMP trap. This opens a dialog box where you choose whether to
generate the trap on the remote computer that raised the alert (SNMP must be
installed and enabled there) or on the Operations Manager management server.
◦ Send a Notification to a Notification Group. This opens a multi-tabbed dialog
box. On the Notification tab, select an existing notification group, modify an
existing notification group, or create a new notification group. On the Email
Format tab, you can accept the standard format for a notification e-mail or edit
this to create a custom format using placeholder variables. On the Page Format
tab, you can accept the standard format for a pager notification message or edit
this to create a custom format using placeholder variables. On the Command
Format tab, you can accept the standard command to run another application
or batch file, or you can edit this to create a custom format using placeholder
variables.
◦ Execute a command or batch file. This opens a dialog box where you can select
the Application and/or the Command Line, and the Initial directory. You also
specify whether the command or batch file should run on the remote computer
(where the Operations Manager agent resides) or on the Operations Manager
management server and the command timeout.
◦ Update state variable. This opens a dialog box where you can add state
variables that correspond to specific actions based on the values of fields in the
source event. Click the Add button in this dialog box and select an action (such
as incrementing the value of the variable or storing the last n occurrences), and
then select the field from the source event that provides the value for this
action. You also specify whether the operation is performed on the remote
computer (where the Operations Manager agent resides) or on the Operations
Manager management server.
◦ Transfer a file. This opens a dialog box where you specify a virtual directory for
the transferred file, whether to upload or download files, and the source and
destination file names. You can use values in the source event fields to select
the appropriate file, and use the standard Windows environment variables (such
as %WINDIR%) to specify the paths.
◦ Call a method on a managed code assembly. This opens a dialog box where you
specify the assembly name and type name for the managed code assembly you
want to execute. You must also enter the method name within that assembly
you want to call, specify whether it is a Static method or an Instance method,
and provide any parameters required for the method. You also specify whether
the assembly is located on the remote computer (where the Operations
Manager agent resides) or on the Operations Manager management server and
the response timeout.
15. Click Next to display the Knowledge Base page, click Edit, and enter any company-
specific knowledge appropriate for the rule that may be useful to operators and
administrators.
16. Click Next to display the Knowledge Authoring page. Click Summary in the Sections
list at the top of the page and enter summary information for this event. Repeat the
process by clicking Causes, Resolutions, and the other available categories and
entering the appropriate information. For each entry, you can specify the GUID of
another rule that shares this entry—this reduces the duplication that may occur if
many rules require the same knowledge.
17. Click Next to display the Advanced page, where you can mark this rule as deleted,
and specify the way it will be exported within a Management Pack. For a new rule,
leave the values set to the defaults.
18. Click Next to display the General page, where you provide a name for the new rule.
By default, the rule is enabled, but you can disable it using the check box on this page.
You can also allow overrides of the rule by specifying the override name.
19. Finally, click Finish to create the new rule, which appears in the Event Rules section
of the left-side tree view. If you want to immediately force the new rule (or any
updated rules) through to the Operations Manager agents on remote computers,
instead of waiting for the scheduled update cycle, right-click the Management Packs
entry in the left-hand tree view, and then click Commit Configuration Change.
By default, Operations Manager pushes rule changes to all remote agents every five
minutes. To change this value, right-click Global Settings in the left-side tree view, click
Management Server Properties, click the Rule Change Polling tab, and then select the
required value.
To create an alert rule for a rule group
1. In the left-side tree view of the Administrator Console, expand the list until Rule
Groups is visible (it is under the Management Packs entry). Expand the rule group to
which you want to add the new alert rule.
2. In the tree view, right-click Alert Rules, and then click Create Alert Rule to open the
Alert Rule Properties Wizard.
3. On the Alert Criteria page, specify any criteria required to match the event rule or
performance rule that generates this alert. You can match the alert using the values
for the alert source, the severity (such as Error or Warning), and confine the match
to alerts generated within a rule group that you select. As you enter criteria, the
Criteria description section shows a summary of these criteria.
4. If you need to apply more specific criteria, click the Advanced button to open the
Advanced Criteria dialog box, where you can select any of the fields for an alert
(such as the Description, Domain, and Owner), and specify criteria to select just the
alerts you want. You can also use the Advanced Criteria dialog box to match any
values in the alert using different conditions, such as partial string matching, regular
expressions, and numerical value order comparisons. After you construct the
required criteria combination, click Close in the Advanced Criteria dialog box. On the
Alert Criteria page, click Next.
By default, event rules will generate alerts that have the same Source and Description
values as the original event. By default, performance rules will generate alerts that
have the Source set to a combination of the Object, Counter, and Instance values from
the original performance counter, and a Description set to the Source value plus the
text "value = " and the current counter value.
5. By default, Operations Manager will always process alerts, but you can change this
behavior on the Schedule page by specifying time spans where Operations Manager
will or will not process alerts. In the drop-down list box, select Only process data
during the specified time or Process data except during the specified time, and
specify the start times and end times for the period. Then use the check boxes below
the drop-down list box to select the days of the week to which this setting applies,
and then click Next.
6. The next page is the Responses page. Here, you specify the actions Operations
Manager should perform when a matching alert occurs. If you do not specify any
response, Operations Manager simply changes the State displayed in the Operator
Console. You can specify the following types of response:
◦ Launch a Script. This opens a dialog box where you select an existing Operations
Manager script or create a new script. You also specify whether the script should
run on the remote computer (where the Operations Manager agent resides) or
on the Operations Manager management server, the script timeout, and any
parameters required by the script.
◦ Send an SNMP trap. This opens a dialog box where you specify where to
generate the trap: on the remote computer (where the Operations Manager
agent resides) or on the Operations Manager management server. You can use
SNMP responses to communicate alerts to other computers and systems that
run a wide variety of operating systems.
◦ Send a notification to a Notification Group. This opens a multi-tabbed dialog
box. On the Notification tab, select an existing notification group, modify an
existing notification group, or create a new notification group. On the Email
Format tab, you can accept the standard format for a notification e-mail or edit
this to create a custom format using placeholder variables. On the Page Format
tab, you can accept the standard format for a pager notification message or edit
this to create a custom format using placeholder variables. On the Command
Format tab, you can accept the standard command to run another application
or batch file or edit this to create a custom format using placeholder variables.
◦ Execute a command or batch file. This opens a dialog box where you can select
the Application and/or the Command Line, and the Initial directory. You also
specify if the command or batch file should run on the remote computer (where
the Operations Manager agent resides) or on the Operations Manager
management server and the command timeout.
◦ Update state variable. This opens a dialog box where you can add state
variables that correspond to specific actions based on the values of fields in the
source alert. Click the Add button in this dialog box, select an action (such as
incrementing the value of the variable, or storing the last n occurrences), and
then select the field from the source alert that provides the value for this action.
You also specify if the operation is performed on the remote computer (where
the Operations Manager agent resides) or on the Operations Manager
management server.
◦ Transfer a file. This opens a dialog box where you specify a virtual directory for
the transferred file, whether to upload or download files, and the source and
destination file names. You can use values in the source alert fields to select the
appropriate file, and use the standard Windows environment variables (such as
%WINDIR%) to specify the paths.
◦ Call a method on a managed code assembly. This opens a dialog box where you
specify the Assembly name and Type name for the managed code assembly you
want to execute. You must also enter the Method name within that assembly
you want to call, specify whether it is a Static or an Instance method, and
provide any Parameters required for the method. You also specify if the
assembly is located on the remote computer (where the Operations Manager
agent resides) or on the Operations Manager management server, and the
response timeout.
7. Click Next to display the Knowledge Base page, click Edit, and then enter any
company-specific knowledge appropriate for the rule that may be useful to
operators and administrators.
8. Click Next to display the Knowledge Authoring page. In the Sections list at the top of
the page, click Summary, and then enter summary information for this alert. Repeat
the process by clicking the Causes, Resolutions, and the other available categories
and entering the appropriate information. For each entry, you can specify the GUID
of another rule that shares this entry—this reduces the duplication that may occur if
many rules require the same knowledge.
9. Click Next to display the Advanced page, where you can mark this rule as deleted,
and specify the way it will be exported within a Management Pack. For a new rule,
leave the values set to the defaults.
10. Click Next to display the General page, where you provide a name for the new rule.
By default, the rule is enabled but you can disable it using the check box on this
page. You can also allow overrides of the rule by specifying the Override Name.
11. Finally, click Finish to create the new rule, which appears in the Alert Rules section
of the left-side tree view. If you want to immediately force the new rule (or any
updated rules) through to the Operations Manager agents on remote computers,
instead of waiting for the scheduled update cycle, right-click the Management Packs
entry in the left-side tree view, and then click Commit Configuration Change.
By default, Operations Manager pushes rule changes to all remote agents every five
minutes. To change this value, right-click Global Settings in the left-side tree view, click
Management Server Properties, click the Rule Change Polling tab, and then select the
required value.
To create a performance rule for a rule group
1. In the left-side tree view of the Administrator Console, expand the list until Rule
Groups is visible (it is under the Management Packs entry). Expand the rule group to
which you want to add the new performance rule. If the tree-view pane is not
visible, click Customize on the View menu. In the Customize View dialog box, select
the Console tree check box, and then click OK.
2. In the tree view, right-click Performance Rules, and then click Create Performance
Rule to open the Performance Rule Type dialog box.
3. In the Performance Rule Type dialog box, select the type of rule you want to create
from the list. The rule types allow you to do the following:
◦ Sample Performance Data. This is a "measuring" rule that causes Operations
Manager to collect numeric values from the Windows performance counter or
WMI counter you specify and store them in the database for viewing and
reporting. You can also generate a response each time the rule collects a value.
◦ Compare Performance Data. This is a "threshold" rule that generates an alert
and/or a response when the sampled value falls outside a specified range or
crosses a defined threshold.
4. After selecting the required rule type, click Next to open the Data Provider page.
Here, you select the source of the data in the Provider name drop-down list box. The
list includes all the standard and custom performance counters defined on the
computers where the Operations Manager agents reside, as well as the counters on
the Operations Manager server itself. Alternatively, for a Compare Performance Data
(threshold) rule, you can select an internally generated or a script-generated
(Generic) event.
5. If the data source you require is not in the drop-down list box, you can specify your
own data provider by clicking the New button and entering the data required. You
can specify an application log file, a Windows performance counter, or a WMI
numeric event. Depending on which option you choose, you see a dialog box or
wizard that allows you to specify details of the data source. You can specify a custom
provider for an application log file, any of the available performance counters on
remote computers, or details of a custom class and methods for a WMI provider.
The Data Provider dialog box also contains a Modify button that opens a dialog box
where you can change the parameters for the selected performance counter. After
you enter information in the required Provider name and Provider type boxes, click
Next to open the Schedule page.
6. By default, Operations Manager will always process performance counters at the
frequency you specify on the Data Provider page, but you can change this behavior
on the Schedule page by specifying time spans where Operations Manager will or
will not collect counter values. In the drop-down list box, select Only process data
during the specified time or Process data except during the specified time, and
specify the start and end times for the period. Then use the check boxes below the
drop-down list box to select the days of the week to which this setting applies, and
then click Next.
7. The page you see next depends on the type of performance rule you are creating.
For a Sample Performance Data (measuring) rule, the next page is the Responses
page, discussed in step 13. If you are creating a Compare Performance Data
(threshold) rule, the next page you see is the Criteria page.
8. On the Criteria page, specify any criteria required to match the counter that provides
the values for this rule. You can match using the values of the fields for a counter:
Counter (name), Instance, Object, Source Computer, and Source Domain. You can
specify match values using the from instance, from computer, and from domain
controls on the Criteria page. As you enter criteria, the Criteria description shows a
summary of these criteria.
9. If you need to apply more specific criteria, or match on the Counter (name) or
Object fields, click the Advanced button to open the Advanced Criteria dialog box
where you can select any of the fields for a counter and specify criteria to select only
the counter you want. You can also use the Advanced Criteria dialog box to match
any values for the counter using different conditions, such as partial string matching,
regular expressions, and numerical value order comparisons. After you construct the
required criteria combination, click Close in the Advanced Criteria dialog box. On the
Criteria page, click Next to show the Threshold page.
10. On the Threshold page, specify the conditions under which the values collected from
the counter will raise an alert or cause a response action to occur. In the Threshold
value section of this page, select an option so that Operations Manager uses just the
current value of the counter, the average value of a specified number of samples, or
the change in the value over a specified number of samples. In the Match when the
threshold meets the following condition: section of this page, select an option so
that Operations Manager will respond to a sample value that is greater than a
specified value, less than a specified value, or always respond. You can also allow
overrides of the threshold values for this rule, and specify the Override Name. Click
the Set Criteria button to specify the target computer or group, and the override
values.
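The three Threshold value options and the match conditions in step 10 can be sketched as a simple evaluation function. This is an illustrative model under assumed semantics, not MOM's implementation; the counter values are hypothetical:

```python
def threshold_met(samples, mode, condition, limit, n=1):
    """Evaluate a Compare Performance Data rule's Threshold page options.
    samples: collected counter values, newest last."""
    if mode == "current":
        value = samples[-1]                # just the current value
    elif mode == "average":
        value = sum(samples[-n:]) / n      # average of the last n samples
    elif mode == "change":
        value = samples[-1] - samples[-n]  # change over the last n samples
    if condition == "greater":
        return value > limit
    if condition == "less":
        return value < limit
    return True                            # the "always respond" option

cpu = [40, 55, 90, 95]  # hypothetical CPU usage samples
print(threshold_met(cpu, "current", "greater", 80))       # True  (95 > 80)
print(threshold_met(cpu, "average", "greater", 80, n=4))  # False (avg 70)
```

Averaging over several samples, as the second call shows, keeps a single spike from raising an alert; the trade-off is slower detection of a genuine sustained problem.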
An override defines a specific computer or computer group. You can create an
override that changes the settings of rules for a specific target computer or group
without having to create custom rules for that target. Overrides allow you to disable a
rule, override the threshold value of a performance threshold rule, override a script
parameter value, and override an existing override parameter in the advanced alert
severity formula. You replace a value in any of the relevant property settings for the
rule with the name of the override. The Override Criteria section of the tree view in the
Administrator Console shows the overrides you create.
11. Click Next to show the Alert page. On this page, you specify if the counters for this
rule will generate an alert (which will appear in the Operator Console) when they
cross a threshold value, and details of the alert. This complex page allows you to do
the following:
◦ Specify that a counter threshold event will generate an alert, by selecting the
Generate alert check box.
◦ Turn on alert severity condition checking, by selecting the Enable state alert
properties check box.
◦ Specify the alert severity (such as Critical Error, Warning, or Success) if you
always want to generate the same severity for this counter threshold event.
Alternatively, you can specify a series of If conditions and an Else condition so
that the severity depends on the parameter values for the counter threshold
event. This allows you to define, for example, that a particular counter threshold
event will generate a Service Unavailable condition for specific values of the
parameters, and a Success condition for other values. Click the Edit button to
enter the condition criteria.
◦ Specify the name of the person responsible for tracking and resolving the
counter threshold event as the Owner. This allows Operations Manager to direct
the alert to the appropriate administrators and operators listed in the
Notification Groups section of the Administrator Console.
◦ Specify the Resolution state for the counter threshold event. By default, this is
New, but you can set it to Assigned to a group of people (such as a helpdesk or
vendors), or mark it as requiring scheduled maintenance. You can use the Global
Settings section of the Administrator Console to modify or define new
resolution states.
◦ Specify the value for the Alert source. This is the text displayed as the Source in
the Operator Console when this counter threshold event occurs. You can enter
custom text or select from any of the fields in the counter threshold event. The
default is to use the Object, Counter, and Instance field values.
◦ Specify the value for the Description. This is the text displayed as the
Description in the Operator Console when this counter threshold event occurs.
You can enter custom text, or select from any of the fields in the counter
threshold event. The default is to use the Object, Counter, and Instance field
values followed by the text "value = " and the counter value.
◦ Specify details of the role of the server in the alert process using the Server role,
Instance, Component, and Custom Fields options.
12. Click Next to show the Alert Suppression page. Use the check boxes in the list of
alert fields to specify the repeated alerts that must have identical values in order for
Operations Manager to ignore (suppress) them, and then click Next.
13. The next page you see, for both types of Performance Rule, is the Responses page.
Here, you specify the actions Operations Manager should perform for a matching
counter. You can specify the following types of response:
◦ Launch a Script. This opens a dialog box where you select an existing Operations
Manager script or create a new script. You also specify if the script should run on
the remote computer (where the Operations Manager agent resides) or on the
Operations Manager management server, the script timeout, and any
parameters required by the script.
◦ Execute a command or batch file. This opens a dialog box where you can specify
the Application and/or the Command Line, and the Initial directory. You also
specify if the command or batch file should run on the remote computer (where
the Operations Manager agent resides) or on the Operations Manager
management server, and the command timeout.
◦ Update a state variable. This opens a dialog box where you can add state
variables that correspond to specific actions based on the values of fields for the
counter. Click the Add button in this dialog box and select an action (such as
incrementing the value of the variable or storing the last n occurrences), and
then select the field from the source counter that provides the value for this
action. You also specify if the operation is performed on the remote computer
(where the Operations Manager agent resides) or on the Operations Manager
management server.
◦ Transfer a file. This opens a dialog box where you specify a virtual directory for
the transferred file, whether to upload or download files, and the source and
destination file names. You can use values in the source counter fields to select
the appropriate file, and use the standard Windows environment variables (such
as %WINDIR%) to specify the paths.
◦ Call a method on a managed code assembly. This opens a dialog box where you
specify the Assembly name and Type name for the managed code assembly you
want to execute. You must also enter the Method name within that assembly
you want to call, specify whether it is a Static or an Instance method, and
provide any Parameters required for the method. You also specify if the
assembly is located on the remote computer (where the Operations Manager
agent resides) or on the Operations Manager management server, and the
response timeout.
14. Click Next to display the Knowledge Base page, click Edit, and enter any company-
specific knowledge appropriate for the rule that may be useful to operators and
administrators.
15. Click Next to display the Knowledge Authoring page. In the Sections list at the top of
the page, click the Summary entry, and then enter summary information for this
counter. Repeat the process by clicking the Causes, Resolutions, and the other
available categories and entering the appropriate information. For each entry, you
can specify the GUID of another rule that shares this entry—this reduces the
duplication that may occur if many rules require the same knowledge.
16. Click Next to display the Advanced page, where you can mark this rule as deleted,
and specify how it will be exported within a Management Pack. For a new rule, leave
the values set to the defaults.
17. Click Next to display the General page, where you provide a name for the new rule.
The rule is enabled by default, but you can disable it using the check box on this page.
You can also allow overrides of the rule by specifying the Override Name.
18. Finally, click Finish to create the new rule, which appears in the Performance Rules
section of the left-side tree view. If you want to immediately force the new rule (or
any updated rules) through to the Operations Manager agents on remote
computers, instead of waiting for the scheduled update cycle, right-click the
Management Packs entry in the left-side tree view, and then click Commit
Configuration Change.
By default, Operations Manager pushes rule changes to all remote agents every five
minutes. To change this value, right-click Global Settings in the left-side tree view, click
Management Server Properties, click the Rule Change Polling tab, and then select the
required value.
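The "call a method on a managed code assembly" response described in step 13 resolves a type and a method by name at run time, distinguishing static from instance methods. As a rough illustration of that pattern — written in Python rather than .NET, with entirely hypothetical names, and not representing any Operations Manager API — a name-based dispatcher might look like this:

```python
import importlib

def invoke_method(module_name, type_name, method_name, is_static, params):
    """Resolve and call a method by name. All names supplied by the
    caller are hypothetical -- this mimics the shape of the response
    action, not any Operations Manager API."""
    module = importlib.import_module(module_name)
    cls = getattr(module, type_name)
    # A static method is called on the type itself; an instance
    # method needs an object constructed first.
    target = cls if is_static else cls()
    return getattr(target, method_name)(*params)
```

In the real response, Operations Manager additionally lets you choose whether the assembly runs on the agent computer or the management server, and applies the response timeout you configure.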

Guidelines for Creating and Configuring a Management Pack in the Operations Manager 2005 Administrator Console
When creating and configuring a management pack in the Operations Manager 2005
Administrator Console, you should consider the following proven practices:

• Use the management model you developed for your application to help you decide
what rules and performance counters you need to create.
• Create a top-level rule group that corresponds to the application, using a name for the
group that makes it easy to identify. You will later be able to use this top-level rule
group to expose the overall rolled-up state of the entire application. Then create child
rule groups to build a multi-level hierarchy that mirrors that of the management model,
adding the appropriate rules into each child group.
• Create only rules directly relevant to your application. Avoid duplicating rules that are
available in built-in Management Packs, such as measuring processor usage or free
memory.
• Use alerts to raise urgent issues to operations staff immediately, perhaps through e-
mail or pager.
• Take advantage of specific features of the monitoring application, such as timed events
that can provide heartbeat monitoring of remote services, or the ability to run scripts or
commands in response to alerts (for example, to query values or call a method that
provides diagnostic information, and then generates a suitable alert).
• Provide as much useful company-specific and application-specific knowledge as possible
for each rule group and rule to make problem diagnosis, resolution, and verification
easier for operators and administrators.

Editing an Operations Manager 2005 Management Pack


After creating or importing a Management Pack in Operations Manager 2005, you will typically
need to perform additional actions to fine-tune the Management Pack or respond to changes in
the operations environment and your management model.
If you have imported a Management Pack from the MMD, changes are often required because the import process does not always generate the ideal combination of rules and rule groups. For example, the MMD generates an alert that creates a notification to members of the administration group. However, this group has no members by default, so you may want to edit this notification, add members to the various notification groups, or create new notification groups.
This section discusses a number of actions that may be necessary when editing an Operations
Manager 2005 Management Pack, including the following:
• Editing rule groups and subgroups
• Editing event rules, alert rules, and performance rules
• Editing computer groups and rollup rules
• Creating and editing operators, notification groups, and notifications
• Viewing and editing global settings

Editing Rule Groups and Subgroups


To edit a rule group, expand the Rule Groups entry in the tree pane of the Administrator Console, right-click the entry for the rule group you want to edit, and then click Properties. The Properties dialog box (see Figure 10) contains six tabs that allow you to edit individual features and settings for this rule group.
Figure 10
The Properties dialog box for a rule group
The resulting dialog box includes the following tabs:
• General. On this page, you can edit the name, description, and version of the group. To
disable all the rules in the group, or re-enable them, clear or select the Enabled check
box.
• Knowledge Base. On this page, you can view the Knowledge Base content (the Purpose,
Features, and Configuration knowledge), and the Company Knowledge Base content for
this rule group. If you want to edit the Company Knowledge Base content, click Edit.
You cannot edit the overall Knowledge Base content on this page—you must use the
Knowledge Authoring page for this.
• Knowledge Authoring. On this page, you can edit the overall Knowledge Base content.
It displays a list of knowledge sections (Purpose, Features, and Configuration). Select a
section in this list and then edit the knowledge content for that section in the text box
in this page. When complete, click the Generate Knowledge button to create the
formatted content. To see the result, go back to the Knowledge Base page.
• Advanced. On this page, you can reconcile Management Packs imported from
Operations Manager 2000 with the changes to the way Operations Manager 2005
handles rule groups. You can also use this page to specify the way that Operations
Manager 2005 will structure rule groups when it exports them, and mark a rule group
as deleted:
◦ In the upper section of the Advanced page is a list of any child rule groups for
this rule group. Select the check box next to any that you want to mark as
deleted—these are usually groups that are also children of other parent rule
groups.
Operations Manager 2000 allows a child rule group to link to multiple parent
rule groups. Later versions of Operations Manager do not support this, so you
must mark the links between parent and child groups that violate this
condition as deleted.
◦ In the lower section of the Advanced page, select from the three options that
govern the export of this rule group. The default option is "Export as a vendor
produced rule. If the rule is disabled, then do not export." If you want to include
the child group (in order to import the rules into Operations Manager 2000),
select "Export as a vendor produced rule. Export rule if it is enabled or disabled."
If you want to export the rule as a modified rule, which Operations Manager will
not overwrite when importing Management Packs, select "Export as a customer
created/modified rule."
◦ If you want to mark the current rule group as deleted (as opposed to marking
child groups as deleted), select the Mark this rule group as deleted check box.
• Computer Groups. On this page, you can deploy the rules in the current rule group to
one or more computer groups. Click the Add button, select an existing computer group
in the Select Item dialog box, and then click OK to deploy the rules to the selected
group. Repeat to deploy the rules to more groups. To remove the rules from a
computer group, select the group in the list on the Computer Groups page, and then
click Remove.
You can double-click the computer group in the Select Item dialog box to open the
Properties dialog box for that computer group, and use it to edit the properties of the
group. For details about editing computer groups, see the procedure To edit Computer
Groups and Rollup Rules.
• Parent Rule Groups. On this page, you can see a list of the rule groups for which the
current rule group is a child. In Operations Manager 2005, each rule group can have
only one parent, so there should be only one rule group shown. The exception is in
Management Packs imported from Operations Manager 2000, where you must use the
Advanced page to mark the relevant child groups as deleted.

After making the required changes to the properties of the rule group, click Apply or OK in the
Properties dialog box. If you want to immediately force the changes through to the Operations
Manager agents on remote computers, instead of waiting for the scheduled update cycle, right-
click the Management Packs entry in the left-side tree view, and then click Commit
Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five minutes. To
change this value, right-click Global Settings in the left-side tree view, click Management Server
Properties, click the Rule Change Polling tab, and then select the required value.
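As noted on the Advanced and Parent Rule Groups pages, later versions of Operations Manager allow each rule group only one parent, while Operations Manager 2000 permitted several. When reconciling an imported Management Pack, it can help to list the child groups that still link to multiple parents, so you know which parent-child links to mark as deleted. A minimal sketch of that check (a hypothetical helper, not part of any Operations Manager tooling):

```python
from collections import defaultdict

def find_multi_parent_children(links):
    """Given (parent, child) rule-group links, return the children that
    have more than one parent -- the links that must be reconciled
    (marked as deleted) after importing a MOM 2000 Management Pack."""
    parents = defaultdict(set)
    for parent, child in links:
        parents[child].add(parent)
    # Only children with two or more parents violate the constraint.
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}
```

For example, if the "Database" group is a child of both the application's rule group and an infrastructure rule group, the helper reports it, and you would mark one of the two links as deleted on the Advanced page.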
Editing Event Rules, Alert Rules, and Performance Rules
To edit rules in the Administrator Console, you must expand the list of rule groups under the
Rule Groups entry (which is under the Management Packs entry) to show all the currently
configured groups. Then expand the group that contains the rule you want to edit, and select
the appropriate rule type (Event Rules, Alert Rules, or Performance Rules). The right window
shows a list of the rules in the selected section. Right-click the rule you want to edit, and then
click Properties (or double-click the rule).

You can search for rules that meet specific criteria if you cannot remember where a rule
resides, or if you want to find rules that have specific properties. Right-click Rule Groups (or
any group under the Rule Groups entry) in the tree view, and then click Find Rules. This opens
the Rule Search Wizard; in it, specify the criteria, such as the location, name, type, or response.
When you click Finish, a new console window appears containing all the matching rules.

The Properties dialog box for an event rule (see Figure 11) contains ten tabs that allow you to
edit individual features and settings for this rule.

Figure 11
The Properties dialog box for an event rule
The Properties dialog box for an alert rule (see Figure 12) contains seven tabs that allow you to
edit individual features and settings for this rule.

Figure 12
The Properties dialog box for an alert rule
The Properties dialog box for a performance threshold rule (see Figure 13) contains eleven tabs
that allow you to edit individual features and settings for this rule. The Properties dialog box for
a performance measuring rule is similar, but it does not have the Criteria, Threshold, Alert, and
Alert Suppression tabs.
Figure 13
The Properties dialog box for a performance threshold rule
Many of the pages in the Properties dialog box are common across the three types of rules:

• General. On this page, you can edit the name of the rule. To disable this rule, or re-
enable it, clear or select the This rule is enabled check box. If you want to override this
rule with another rule defined elsewhere, select the Enable rule-disable overrides for
this rule check box, click the Set Criteria button, and then click the Add button in the
Set Override Criteria dialog box. Select a computer or a computer group, and then
specify Enable (0) or Disable (1) in the Edit Override Criteria dialog box. This allows you
to specify whether this rule will apply to the selected computer or group.
• Data Provider. (This page is not available for an alert rule.) On this page, you can select
the source of the event or the performance counter that acts as the data source for the
rule:
◦ For an event rule, you can select a Windows Event Log, a scheduled (timed)
event, a WMI event, or a custom script event. To specify a source not in the list,
click the New button, select an event type in the Select Provider Type dialog
box, click OK, and then specify the details for this source.
◦ For a performance rule, you can select any of the performance counters
exposed by Operations Manager and the Operations Manager agent installed on
monitored computers, or a script-generated or internally generated event. To
specify a source not in the list, click the New button, select a performance
counter type in the Select Provider Type dialog box, click OK, and then specify
the details for this source. To edit the properties, such as the counter location or
synchronization, click the Modify button and edit the values as required.
• Schedule. On this page, you can specify the periods when the rule is active. By default,
the rule is active at all times. To specify the active periods, select either Only process
data during the specified time or Process data except during the specified time, select
the start and end times, and then select the check boxes for the days of the week to
which this period applies.
• Criteria. (For an alert rule, this page is labeled Alert Criteria; this page is available for all
rule types except for a performance measuring rule.) On this page, you can specify how
an event rule or a performance threshold rule matches the source event or
performance counter, or how an alert rule matches the source alert:
◦ For an event rule, you can specify the Source, ID, Type, and Description
properties of the source event. Alternatively, click the Advanced button to
specify individual criteria for matching on any of the fields of the source event,
using a range of string matching, regular expressions, and numerical order
matching operations.
◦ For a performance threshold rule, you can specify the Instance, Domain, and
Computer properties of the source counter. Alternatively, click the Advanced
button to specify individual criteria for matching on any of the fields of the
source counter, using a range of string matching, regular expressions, and
numerical order matching operations.
◦ For an alert rule, you specify the Alert source and Severity properties of the
source alert generated by an event rule or a performance rule. If you only want
to match alerts from rules in a specific rule group, select the "Only match alerts
generated by rules in the following groups:" check box, click the Browse button,
and then select the appropriate rule group. You can also click the Advanced
button to specify individual criteria for matching on any of the fields of the
source alert, using a range of string matching, regular expressions, and
numerical order matching operations.
• Threshold. (This page is available for only a performance threshold rule.) On this page,
you can specify the way that the rule samples the counter values, and the way that it
matches the sampled values. You can specify that the rule should calculate the
Threshold value using a single counter value, the average of a specified number of
values, or a specified change in the values. You can also specify that the threshold value
must be greater than or less than a value you provide, or that the rule should raise an
alert for all values. Finally, you can use this page to enable an override for this rule, and specify the
overriding rule.
• Alert. (This page is not available for an alert rule or a performance measuring rule.) On
this page, you can turn on and turn off generation of an alert when this event rule or
performance rule is activated and set the properties for the alert it generates. The
controls on this page allow you to do the following:
◦ Specify whether the event or counter will generate an alert by selecting the
Generate alert check box.
◦ Turn on alert severity condition checking by selecting the Enable state alert
properties check box.
◦ Specify the Alert severity (such as Critical Error, Warning, or Success) if you
always want to generate the same severity alert for this event or counter.
Alternatively, you can specify a series of If conditions and an Else condition so
that the severity depends on the parameter values for the event or counter. This
allows you to define, for example, that a particular event or counter will
generate a Service Unavailable condition for specific values of the parameters,
and a Success condition for other values. Click the Edit button to enter the
condition criteria.
◦ Specify the name of the person responsible for tracking and resolving the alert
as the Owner. This allows Operations Manager to direct the alert to the
appropriate administrators and operators listed in the Notification Groups
section of the Administrator Console.
◦ Specify the Resolution state for the alert. By default, this is New, but you can set
it to Assigned to in order to assign it to a group of people such as a helpdesk or
vendors, or mark it as requiring scheduled maintenance. You can use the Global
Settings section of the Administrator Console to modify or define new
resolution states.
◦ Specify the value for the Alert source. This is the text displayed as the Source in
the Operator Console when this alert occurs. You can enter custom text or select
from any of the fields in the event or counter that causes this alert. The default
is to use the Source field value.
◦ Specify the value for the Description. This is the text displayed as the
Description in the Operator Console when this alert occurs. You can enter
custom text, or select from any of the fields in the event or counter that causes
this alert. The default is to use the Description field value.
◦ Specify details of the role of the server in the alert process using the Server role,
Instance, Component, and Customer Fields options.
Not all the controls on the Alert page are available for every type of event rule
or performance rule. Depending on the type of rule and the provider source,
some of the controls may be disabled.
• Alert Suppression. (This page is not available for an alert rule or a performance
measuring rule.) On this page, you can compound multiple events or counter samples
into a single alert; this prevents the generation of duplicate alerts for the same source
condition. Turn on alert suppression using the check box at the top of this page, and
then select the check boxes in the list of alert fields below for those that must be
identical to suppress duplicated alerts.
• Responses. On this page, you can specify the actions that should occur when the event
rule, alert rule, or performance rule is activated. Click the Add button to show a list of
the available responses and click the one you require. Alternatively, click the Edit
button to edit an existing response selected in the list, or click the Remove button to
remove the selected response. The response actions available are the following:
◦ Launch a Script. This opens a dialog box where you select an existing Operations
Manager script or create a new script. You also specify if the script should run on
the remote computer (where the Operations Manager agent resides) or on the
Operations Manager management server, the script timeout, and any
parameters required by the script.
◦ Send an SNMP trap. This opens a dialog box where you specify where to
generate the trap: on the remote computer (where the Operations Manager
agent resides) or on the Operations Manager management server. You can use
SNMP responses to communicate alerts to other computers and systems that
run a wide variety of operating systems.
◦ Send a notification to a notification group. This opens a multi-tabbed dialog
box. On the Notification tab, select an existing notification group, modify an
existing notification group, or create a new notification group. On the Email
Format tab, you can accept the standard format for a notification e-mail or edit
this to create a custom format using placeholder variables. On the Page Format
tab, you can accept the standard format for a pager notification message or edit
this to create a custom format using placeholder variables. On the Command
Format tab, you can accept the standard command to run another application
or batch file or edit this to create a custom format using placeholder variables.
◦ Execute a command or batch file. This opens a dialog box where you can specify
the Application and/or the Command Line, and the Initial directory. You also
specify if the command or batch file should run on the remote computer (where
the Operations Manager agent resides) or on the Operations Manager
management server, and the command timeout.
◦ Update state variable. This opens a dialog box where you can add state
variables that correspond to specific actions based on the values of fields for the
counter. Click the Add button in this dialog box to select an action (such as
incrementing the value of the variable or storing the last n occurrences), and
then select the field from the source counter that provides the value for this
action. You also specify if the operation is performed on the remote computer
(where the Operations Manager agent resides) or on the Operations Manager
management server.
◦ Transfer a file. This opens a dialog box where you specify a virtual directory for
the transferred file, whether to upload or download files, and the source and
destination file names. You can use values in the source counter fields to select
the appropriate file, and use the standard Windows environment variables (such
as %WINDIR%) to specify the paths.
◦ Call a method on a managed code assembly. This opens a dialog box where you
specify the Assembly name and Type name for the managed code assembly you
want to execute. You must also enter the Method name within that assembly
you want to call, specify whether it is a Static or an Instance method, and
provide any Parameters required for the method. You also specify if the
assembly is located on the remote computer (where the Operations Manager
agent resides) or on the Operations Manager management server, and the
response timeout.
• Advanced. On this page, you can specify how Operations Manager 2005 will structure
rules when it exports them. If you want to mark the rule as deleted, select the
Mark this rule as deleted check box. Then select from the three options that govern the
export of this rule. The default option is "Export as a vendor produced rule. If the
rule is disabled, then do not export." If you want to export the rule whether it is
enabled or disabled (for example, in order to import the rules into Operations
Manager 2000), select the "Export as a vendor produced rule. Export rule if it is
enabled or disabled" check box. If you want to export the rule as a modified rule,
which Operations Manager will not overwrite when importing Management Packs,
select the "Export as a customer created/modified rule" check box.
• Knowledge Base. On this page, you can view the Knowledge Base content and the
Company Knowledge Base content for this rule. Click the Edit button if you want to edit
the Company Knowledge Base content. You cannot edit the overall Knowledge Base
content in this page—you must use the Knowledge Authoring page for this.
• Knowledge Authoring. On this page, you can edit the overall Knowledge Base content.
It displays a list of knowledge Sections (such as Summary, Causes, and Resolutions).
Select a section in this list and then edit the knowledge content for that section in the
text box in this page. You can also specify that each knowledge section is shared with
other rules by clicking the Share new button and entering the sharing rule ID. This
reduces duplication of content and makes updates easier. When complete, click the
Generate Knowledge button to create the formatted content. To see the result, go back
to the Knowledge Base page.
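The Threshold page options described above amount to a small decision: how to derive a measured value from the samples, and how to compare it against the limit. The sketch below is illustrative only — the product's actual sampling logic is internal — and simply restates those choices as a Python function:

```python
def threshold_breached(samples, mode, comparison, limit=None):
    """Evaluate a performance-counter threshold in the spirit of the
    Threshold page: 'value' tests the latest sample, 'average' tests
    the mean of the samples, and 'change' tests the difference between
    the first and last sample. Illustrative sketch, not a MOM API."""
    if mode == "value":
        measured = samples[-1]
    elif mode == "average":
        measured = sum(samples) / len(samples)
    elif mode == "change":
        measured = samples[-1] - samples[0]
    else:
        raise ValueError(f"unknown mode: {mode}")
    if comparison == "always":       # raise an alert for every sample
        return True
    if comparison == "greater":
        return measured > limit
    if comparison == "less":
        return measured < limit
    raise ValueError(f"unknown comparison: {comparison}")
```

For example, a rule configured to alert when the average of the samples exceeds 90 will stay quiet for samples of 40 and 95 (average 67.5), whereas a rule testing the single latest value would fire.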

After making the required changes to the properties of the rule, click Apply or OK in the
Properties dialog box. If you want to immediately force the changes through to the Operations
Manager agents on remote computers, instead of waiting for the scheduled update cycle, right-
click the Management Packs entry in the left-side tree view, and then click Commit
Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five minutes.
To change this value, right-click Global Settings in the left-side tree view, click Management
Server Properties, click the Rule Change Polling tab, and then select the required value.

Editing Computer Groups and Rollup Rules


To edit computer groups and rollup rules, in the Administrator Console, expand the list of
computer groups under the Computer Groups entry (which is under the Management Packs
entry) to show all of the currently configured computer groups. Right-click the entry for the
computer group you want to edit, and then click Properties to open the Properties dialog box,
as illustrated in Figure 14.
Figure 14
The Properties dialog box for a computer group
The Properties dialog box contains ten tabs that allow you to edit individual features and
settings for this computer group:

• General. On this page, you can edit the name and description for the group.
• Included Subgroups. (This page displays a list of the subgroups within this group.) On
this page, you can add and remove subgroups. Click the Add button to open the Add
Subgroup dialog box, select an existing computer group, and then click OK to move it
from its current position in the computer groups hierarchy to become a child of the
current group. To remove a subgroup from the current computer group, select it in the
list on the Included Subgroups page, and then click the Remove button.
• Included Computers. (This page displays a list of the computers within the current
computer group.) On this page, you can add a new computer to the group. Click the
Add button to open the Add Computer dialog box, which shows a list of computers that
have an Operations Manager agent installed. Select the check box next to computers in
the list that you want to add to this group, and then click OK. To add a computer that is
not listed, click New in the Add Computer dialog box, enter the domain name and
computer name, and then click OK. To remove a computer from the current computer
group, select it in the list on the Included Computers page, and then click the Remove
button.
• Excluded Computers. (This page displays a list of the computers that are always
excluded from the current computer group, even if they are listed on the Included
Computers page.) On this page, you can exclude a computer. Click the Add button to
open the same Add Computer dialog box as used on the Included Computers page
(described earlier). Alternatively, click the Search button to open the Computer dialog
box where you can specify computers to exclude using wildcard strings or regular
expressions to match on the domain name or the computer name. Select a computer
on the Excluded Computers page, and then click Edit to edit an existing computer or
Remove to remove the selected computer.
• Search for Computers. On this page, you can specify criteria that select computers to
add to this computer group. You can search for different types of computer (such as
servers, clients, and domain controllers), and use wildcard strings or regular expressions
to match on the domain name or the computer name.
• Formula. On this page, you can specify a formula that selects computers based on the
criteria entered on the Search for Computers page. You can generate the formula using
a range of attributes for the target computers, such as the IP address, subnet, operating
system, fully qualified domain name, and more. You can also use a range of operators
and string matching functions, and select from lists of other computer groups.
• State Rollup Policy. On this page, you can specify how the overall state for a computer
group will reflect the states of individual members of the group. The members can be
the subgroups included within this group and/or the individual computers in the group.
The three options on this page (see Figure 15) are the following:
◦ The worst state of any member computer or subgroup. If you select this option,
Operations Manager will set the State value displayed in the Operator Console
to that specified for the Severity for the worst of the current unresolved alerts
for the members of this group. The alert Severity states range from Success
(best) to Service Unavailable (worst). You can see a list of these states on the
Alert page of the Properties dialog box for any of your existing event rules,
performance rules, or alert rules.
◦ The worst state from the specified percentage of best states in the computer
group. If you select this option, you must specify a percentage that defines the
proportion of the group that will act as the state indicator for the group.
Operations Manager selects the members of the group that have the best health
state, up to the specified percentage of the total group membership. In other
words, if there are 10 computers and you specify 60%, Operations Manager will
select the six members of the group that currently have the least severe state.
It then uses the worst (the most severe) state of this subset as the overall
(rolled-up) state for the group, and displays this in the Operator Console as the
State value for this computer group.
◦ The best state of any member or subgroup. If you select this option, Operations
Manager will set the State value displayed in the Operator Console to that
specified for the Severity for the best of the current unresolved alerts for the
members of this group. It is unlikely that you will use this option very often,
because it effectively hides the state of most of the members of the group as
long as one member is performing correctly.
Figure 15
The State Rollup Policy page of the Properties dialog box for a computer group
• Console Scopes. (This page displays a list of the scopes where the current computer
group is used. By default, every group is a member of every scope.) On this page,
administrators can specify custom sets of computer groups for each scope (Operations
Manager Users, Operations Manager Authors, and Operations Manager Administrators)
using the Console Scopes options within the main Administration section of the
Administrator Console.
• Parents. This page displays a list of the parent computer groups for this group, if it is a
child (nested) group.

• Rules. On this page, you can enable and disable the rules in this computer group and its
child subgroups. Select the check box at the top of the page to disable all the rules in
this group and all its child subgroups (if any). The Rules page also shows a list of any
rule groups associated with parent computer groups that this computer group inherits.
At the bottom of the page, a list shows the rule groups already associated with this
computer group, which its child computer groups will inherit. To add a rule group to this
list, click the Add button to open the Select Rule Group dialog box, select the required
rule group, and then click OK. To remove a rule group from the list, select it, and then
click the Remove button.
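The three State Rollup Policy options can be expressed as a short function. The sketch below uses only the severity levels named in this chapter, ordered from best to worst (the product defines additional levels between these), and is an illustration of the policy logic rather than an Operations Manager API:

```python
# Subset of alert severities named in this chapter, best to worst.
SEVERITY_ORDER = ["Success", "Warning", "Critical Error", "Service Unavailable"]

def rolled_up_state(member_states, policy, best_percentage=100):
    """Compute a computer group's rolled-up state from its members'
    states, mimicking the three State Rollup Policy options."""
    ordered = sorted(member_states, key=SEVERITY_ORDER.index)  # best first
    if policy == "worst":
        return ordered[-1]
    if policy == "best":
        return ordered[0]
    if policy == "percentage":
        # Keep the best N% of members, then report the worst state
        # found within that subset.
        count = max(1, int(len(ordered) * best_percentage / 100))
        return ordered[:count][-1]
    raise ValueError(f"unknown policy: {policy}")
```

Using the example from the State Rollup Policy description: with 10 members (five Success, one Warning, four Critical Error) and a 60% setting, the six best members are the five Success computers plus the Warning one, so the group rolls up as Warning.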

After making the required changes to the properties of the computer group, click Apply or OK in
the Properties dialog box. If you want to immediately force the changes through to the
Operations Manager agents on remote computers, instead of waiting for the scheduled update
cycle, right-click the Management Packs entry in the left-side tree view, and then click Commit
Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five minutes.
To change this value, right-click Global Settings in the left-side tree view, click Management
Server Properties, click the Rule Change Polling tab, and then select the required value.
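Several of the pages described above (Excluded Computers, Search for Computers, and Formula) match computers using wildcard strings or regular expressions against the domain name or computer name. A minimal sketch of both matching styles over "DOMAIN\NAME" strings (a hypothetical helper, not part of Operations Manager):

```python
import fnmatch
import re

def select_computers(computers, pattern, use_regex=False):
    """Filter 'DOMAIN\\NAME' strings with either a wildcard pattern or
    a regular expression, in the spirit of the computer-group search
    pages. Illustrative only."""
    if use_regex:
        rx = re.compile(pattern)
        return [c for c in computers if rx.search(c)]
    # fnmatch gives shell-style wildcards: * and ? match any characters.
    return [c for c in computers if fnmatch.fnmatchcase(c, pattern)]
```

A wildcard such as CORP\WEB* selects every web server in the CORP domain, while a regular expression can express more precise criteria, such as names ending in a numeric suffix.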

For details about how to create a computer group, see the later section, "Create an Operations
Manager 2005 Computer Group and Deploy the Operations Manager Agent and Rules."

Creating and Editing Operators, Notification Groups, and Notifications
Rules can create notifications that consist of e-mail or pager alerts, or they can run commands
to perform custom notification tasks. You first create notification groups and add individual
members to these groups. Then you can specify the group(s) to which a rule will send
notifications:
To create and edit operators, notification groups, and notifications
1. In the Administrator Console, expand the Notification entry (which is under the
Management Packs entry), and then expand the Notification Groups entry to show all
of the currently configured notification groups. The Operators entry contains a list of
all operators configured for Operations Manager 2005. The Notification Groups
entry contains a list of all configured Notification Groups. Select a group entry in the
left-side tree view to display a list of the operators within that group in the right
pane. Select the Notification entry to see a summary of the number of configured
operators and groups and links to view and create operators and groups (see Figure
16).
Figure 16
The Notification and Notification Groups section of the Administrator Console
2. To create a new operator, click the Create Operator link in the right pane, or right-
click Operators in the left pane, and then click Create Operator. In the Operator
Properties Wizard that opens, enter the name of the operator and specify whether
this operator is enabled by selecting or clearing the Enabled check box, and then
click Next to show the Email page.
3. On the Email page, select the Email this operator check box if you want Operations
Manager to send email alerts to this operator. Enter the e-mail address, and then
select either Always email this operator (to send e-mail messages at any time) or
Email this operator at the specified times. If you select the second option, enter the
start and end times for the period, and then select the days of the week to which it
applies. Click Next to show the Page(r) page.
4. On the Page(r) page, select the Page this operator check box if you want Operations
Manager to send pager alerts to this operator. Enter the pager address, and then
select either Always page this operator (to send pager alerts at any time) or
Page this operator at the specified times. If you select the second option, enter the
start and end times for the period, and then select the days of the week to which it
applies. Click Next to show the Command page.
5. On the Command page, select the Notify this operator by external command check
box if you want Operations Manager to alert this operator by running an external
command that you specify in the Global Settings for Operations Manager 2005. You
must enter an operator ID string value that is passed to the command. For details
about editing the Global Settings, see "Viewing and Editing Global Settings" later in this
chapter. Click Finish to create the new operator.
6. To edit the properties for an existing operator, select the Operators entry in the left-
side tree view. In the list of operators in the right pane, right-click the one you want
to edit, and then click Properties. The Properties dialog box has the following tabs:
◦ General. On this page, you can edit the operator name and enable or disable
this operator.
◦ Email. On this page, you can enable or disable the sending of email alerts to this
operator, edit the e-mail address, and specify the periods when e-mail alerts can
be sent.
◦ Page(r). On this page, you can enable or disable the sending of pager alerts to
this operator, edit the pager address, and specify the periods when pager alerts
can be sent.
◦ Command. On this page, you can enable or disable the execution of an external
command that sends alerts to this operator, edit the operator ID that is passed
to the command, and specify the periods when commands can be executed.
◦ Notification Groups. This page displays a list of the groups of which this
operator is a member. On this page, you can add this operator to another
group. Click the Add button to open the Notification Groups dialog box, select a
group, and then click OK. To remove this operator from a notification group,
select the group in the list on the Notification Groups page, and then click the
Remove button.
7. To delete an existing operator, right-click it in the right pane, and then click Delete.
Click Yes in the confirmation dialog box that appears.
8. To create a new notification group, right-click Notification Groups in the left-side
tree view, and then click Create Notification Group. In the Notification Group
Properties dialog box that opens, enter the name for the new group.
9. To add members to the new group, select them in the right-side list of Available
operators and click the "<-" button. To create a new operator to add to the group,
click the New Operator button to start the Operator Properties Wizard and follow
the steps shown earlier in this procedure (steps 2 through 5). After adding or
creating all the required operators, click Finish in the Notification Group Properties
dialog box.
10. To edit an existing notification group, right-click it in the left-side tree view, and then
click Properties to open the Notification Group Properties dialog box. Edit the name
of the group in the text box at the top of the dialog box, and edit the list of members
for the group by selecting them in the lists and clicking the "<-" and "->" buttons.
11. To have Operations Manager generate a notification when an event, performance
counter threshold, or alert occurs, you specify one or more notification groups in the
properties of that rule. Select the required rule from the Event Rule, Performance
Rule, or Alert Rule section of the appropriate rule group, right-click it, and then click
Properties. For details about the properties of a rule, see the earlier section, "To edit
Event Rules, Alert Rules, and Performance Rules."
12. In the Properties dialog box for the rule, open the Responses page, and then click
the Add button. Select Send a notification to a Notification Group, and then select
the group from the drop-down list box. Repeat to add more notification groups as
required, and then click OK to close the rule Properties dialog box.
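Steps 3 and 4 above amount to a simple time-window check: an operator is notified either at any time or only during the specified hours on the specified days. The following Python sketch illustrates that logic under stated assumptions (the function name and the hour-granularity are inventions for illustration; Operations Manager itself evaluates the start and end times you enter in the wizard):

```python
from datetime import datetime

def should_notify(always, start_hour, end_hour, days, when):
    """Decide whether an alert raised at 'when' falls inside an
    operator's notification window, mirroring the choice between
    'Always email/page this operator' and the specified-times option."""
    if always:
        return True
    # 'days' holds the checked days of the week (Monday=0 .. Sunday=6).
    if when.weekday() not in days:
        return False
    # The wizard takes start and end times; whole hours are used here
    # purely for simplicity.
    return start_hour <= when.hour < end_hour

# A weekday, office-hours operator: notified Monday at 10:00,
# but not on Saturday.
office_hours = {0, 1, 2, 3, 4}
```

A rule's response then sends the notification only to operators for whom this check succeeds, which is why the same rule can page one group member and skip another.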

Viewing and Editing Global Settings
Many of the features in Operations Manager 2005 depend on configuration settings and
properties defined as global settings. To view global settings in the Administrator Console,
expand the Administration entry, and then select Global Settings. The right pane shows the
global settings for Operations Manager 2005. Although there are thirteen entries for the global
settings, there are only three different Properties dialog boxes. You can open the main global
settings Properties dialog box by double-clicking (or right-clicking and then clicking Properties)
any of the entries except Management Servers and Agents. The main Properties dialog box is
shown in Figure 17.

Figure 17
The main Properties dialog box for the Global Settings in Operations Manager 2005
The main Properties dialog box contains eleven tabbed pages:

• Notification Command Format. On this page, you can specify a custom application that
you want to execute in response to an operator alert. You can specify the command line
for the application and include placeholders that Operations Manager replaces with
values when it executes the command. These placeholders include the Operator ID you
specify in the Properties dialog box for each operator.
• Knowledge Base Template. On this page, you can edit the HTML template Operations
Manager uses to generate the multi-section knowledge base content for items in a
Management Pack. The template contains placeholders of the form <!section-name>
that indicate where Operations Manager will insert the separate sections of knowledge
content text.
• Database Grooming. On this page, you can specify how Operations Manager will
automatically mark alerts as resolved after a certain period, removing them from the
Operator Console display.
• Operational Data Reports. On this page, you can choose to automatically send
Microsoft reports about how you use Operations Manager 2005; this provides valuable
feedback to the development team about typical usage patterns.
• Custom Alert Fields. On this page, you can change the names of the five custom fields
displayed in alerts. You can use these if you want to add application-specific or
company-specific information to every alert.
• Alert Resolution States. On this page, you can modify the existing alert resolution
states or add new ones. The default states include Acknowledged, Assigned to xxx, and
Resolved. You can also specify the service level interval within which each state should
be resolved, the shortcut key assigned to this state, and whether users can set the state
within the Operator Console and the Web Console.
• Email Server. On this page, you can configure the settings used to send e-mail alerts
through your SMTP mail server.
• Licenses. On this page, you can manage the number of management licenses for
remote managed clients.
• Web Addresses. On this page, you can specify the URL of the Operations Manager Web
Console and the URL used for online product knowledge (the default is the Microsoft
Support Web site). You can also specify custom Web addresses for your file server (for
transferring files to clients), and for your company knowledge base.
• Communications. On this page, you can specify the port Operations Manager uses for
encrypted communication with remote managed computers.
• Security. On this page, you can specify features of the authentication and the response
execution for the Operations Manager server and remote managed computers.

After making the required changes to the global settings, click Apply or OK in the Properties
dialog box.
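The Knowledge Base Template described above uses placeholders of the form <!section-name> to mark where each section of knowledge content is inserted. The sketch below shows one plausible way such substitution could work; the placeholder form comes from the description above, but the function and the section names are illustrative assumptions, not the actual Operations Manager implementation:

```python
import re

def fill_knowledge_template(template, sections):
    """Substitute each <!section-name> placeholder with the matching
    block of knowledge text; sections with no supplied content become
    empty strings. Illustrative sketch only."""
    return re.sub(r"<!([\w-]+)>",
                  lambda m: sections.get(m.group(1), ""),
                  template)

# A hypothetical two-section template and one supplied section.
template = "<h2>Summary</h2><!summary><h2>Resolution</h2><!resolution>"
page = fill_knowledge_template(template, {"summary": "Disk space is low."})
```

Editing the real template in the Global Settings changes the surrounding HTML while keeping the placeholders in place, so every Management Pack item's knowledge content is rendered consistently.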
In the Management Servers Properties dialog box and the Agents Properties dialog box, you
can fine-tune the behavior of the Operations Manager server and the Operations Manager
agents installed on the Operations Manager server and on remote computers. You will usually
not need to change these settings, and this chapter does not describe them in detail. For more
information, examine the Operations Manager 2005 Help file or click the Help button in the
relevant Properties dialog box.

After you finish editing your Management Pack(s) and settings, you can turn off
Authoring mode. Right-click Rule Groups in the left-side tree view, click Disable
Authoring mode, and then click Yes in the confirmation dialog box.

Guidelines for Editing an Operations Manager 2005 Management Pack
When editing an Operations Manager 2005 Management Pack, consider the following proven
practices:
• Ensure that your Management Pack contains the appropriate rule groups and rules to
match the management model and the instrumentation exposed by the application.
Keep the hierarchy of the rule groups and the properties of the rules as simple as
possible, while following the structure and the requirements of the management
model.
• Ensure that your Management Pack contains the appropriate computer groups and
subgroups that match the physical layout of the computers that will run the application.
Use subgroups to provide the appropriate rollup features for each subset of computers
that run each component or section of the application.
• Create the appropriate notification groups containing operators that manage the
application and other people that have an interest in its operation (such as business
owners and application developers). Configure responses for the rules and alerts that
send operational alerts to the appropriate groups.
• Modify any of the global settings that affect your application. For example, you may
want to use the custom alerts fields for company-specific information or modify the
alert resolution states and service level periods to suit your requirements.

Create an Operations Manager 2005 Computer Group and Deploy the Operations Manager Agent and Rules
When creating or editing a Management Pack for Microsoft Operations Manager (or the
equivalent sets of rules and knowledge for other monitoring applications and environments), it
is sometimes hard to relate the architecture of a management model to the features provided
by the monitoring application. For example, a management model as defined in the MMD tool
uses a hierarchical structure of application components and services, using a simple three-state
(RED, YELLOW, and GREEN) indicator paradigm for the health of each section. Most monitoring
applications provide a wide range of features, but they do not relate directly to this simple
approach.
To match the management model to the capabilities of the monitoring application, you can
create groups of computers that perform similar or related tasks, and then combine these
groups in a hierarchical way that mirrors the structure of the management model. Each group
exposes a rolled-up state indication that depends on the state of its members, according to the
rules contained in the management model.
The set of rules specified in the management model for each component or section of the
application, implemented as a rule group in the monitoring application, then corresponds to a
group of computers, and you can associate and deploy the appropriate set of rules to each
group.
This section contains the following procedures:
• To deploy the Operations Manager agent to remote computers
• To create a computer group
• To associate a rule group with a computer group and deploy the rules

To deploy the Operations Manager agent to remote computers
1. In the left-side tree view of the Administrator Console, expand the list to show the
Computers entry (which is under the Administration entry). If the tree-view pane is
not visible, click Customize on the View menu. In the Customize View dialog box,
select the Console tree check box, and then click OK.
2. Right-click Computers, and then click Install/Uninstall Agents Wizard. Alternatively,
you can click the Install/Uninstall Agents Wizard link in the right pane of the
Administrator Console after selecting the Computers entry in the tree view.
3. Click Next on the introduction page of the wizard. On the next page, select the
Install Agents option (you can also use this wizard to remove installed agents from
specific computers by selecting the Uninstall Agents option).
4. Click Next, and then select one of the following two options for discovering
computers:
◦ Browse for or type in specific computer names. If you select this option, the
next page you see displays an empty list and a Browse button that opens the
Select Computers dialog box. The Select Computers dialog box is the same as
you use to find objects in Active Directory or within a domain (see Figure 18).
You can use the Advanced button in this dialog box to search for computers
based on a range of criteria and conditional matching methods.
Figure 18
Selecting computers from the domain using the Select Computers dialog box
◦ Search criteria. If you select this option, the next page you see displays an
empty list for the discovery rules. Click Add to open the Computer Discovery
Rule dialog box. In this dialog box, you specify the Domain name containing the
computers you want to discover, a condition and a text string for the Computer
name, and select a Computer type. You can use partial string matches, including
wildcards, or a regular expression to match the computer name, and specify
whether the computer type is a server, a client, or both servers and clients. You
can also apply the discovery rule to domain controllers if you want. By default,
the wizard will contact each computer in turn to verify that it exists, though you
can disable this feature using the check box at the bottom of the Computer
Discovery Rule dialog box. After creating a discovery rule, click OK to add it to
the list of rules in the wizard, and repeat the process to add more rules as
required.
5. Click Next, and then specify the account the wizard will use to install the agents. The
default is to use the Management Server Action Account created when you installed
Operations Manager. However, if required, you can select Other, and then specify a
user name and a password for the account you want to use.
6. Click Next, and then specify the Agent Action Account. This is the account that the
agent will run under on the remote computers. The default is the Local System
account. However, if required, you can select Other and specify a user name and a
password for the account you want to use.
7. Click Next, and then specify the folder on the remote computers where the wizard
will install the agent. The default is a subfolder of the Program Files directory,
though you can select other environment variables (such as %SYSTEMDRIVE% and
%PROGRAMFILES%) in the drop-down list box, and then add a custom path if
required.
8. Click Next to see a summary of the actions the wizard will perform. By default, the
wizard will display the progress of each action, though you can clear the check box
on this page to prevent this if you prefer. Click Finish to install the agents.
9. After the wizard completes, you can use the links in the right pane of the
Administrator Console or the entries below the Computers entry in the tree view to
see a list of the computers that have the agent installed (see Figure 19).
Figure 19
The Computers page shows the installed agents and a link to install/uninstall the agent
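The name condition in a Computer Discovery Rule (step 4 above) filters computers either by a wildcard pattern or by a regular expression. The following sketch illustrates the idea; the function is a hypothetical stand-in for what the rule dialog box configures, not an Operations Manager API:

```python
import fnmatch
import re

def match_computers(names, pattern, use_regex=False):
    """Filter a list of computer names the way a discovery rule's
    Computer name condition might: wildcard match by default, or a
    regular expression when use_regex is True. A sketch only; the
    real matching is configured in the Computer Discovery Rule dialog."""
    if use_regex:
        return [n for n in names if re.fullmatch(pattern, n)]
    # fnmatchcase gives the same (case-sensitive) result on every platform.
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]

# Hypothetical discovered names in the domain.
discovered = ["WEB01", "WEB02", "SQL01", "APP07"]
```

A wildcard such as "WEB*" selects only the Web servers, while a regular expression such as r"(WEB|SQL)\d+" selects both the Web and database servers in one rule.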

To create a computer group
1. In the left-side tree view of the Administrator Console, expand the list to show the
Computer Groups entry (which is under the Management Packs entry). If the tree-
view pane is not visible, click Customize on the View menu. In the Customize View
dialog box, select the Console tree check box, and then click OK.
2. If this is the first computer group for your application, you must create a top-level
(parent) group (you can add child groups to this group as you create it if required).
To create a top-level computer group, right-click Computer Groups in the tree view,
and then click Create Computer Group to start the wizard.
3. If you have already created a top-level computer group for your application, you can
create nested (child) groups within that top-level group. To create a child computer
group, right-click the top-level group entry, and then click Create Computer Group
to start the wizard.
4. Click Next to open the General page. Enter a name for the computer group, and then
enter a description that will help operators to identify the purpose of this group.
5. Click Next to open the Included Subgroups page. If you want to add existing groups
as children of the new group (subgroups), click the Add button to open the Add
Subgroup dialog box. Select the check box for each existing group you want to add,
and then click OK. The wizard displays the new group and the subgroups you
selected.
6. Click Next to open the Included Computers page. Click Add to open the Add
Computer dialog box. This shows the Windows domain that contains the Operations
Manager management server and all computers discovered in that domain
(computers that have the Operations Manager agent installed). Select the check box
next to the domain name to include all listed computers or select the check boxes
next to individual computers you want to include. To add computers not already
listed, click New, enter the domain name and computer name, and then click OK.
After you select the computers you want to include, click OK to close the Add
Computer dialog box. The wizard displays a list of the computers you selected.
7. Click Next to open the Excluded Computers page. A computer group can include all
the computers in a domain or computers found using a search or formula process (as
you will see later). You can exclude specific computers on the Excluded Computers
page by clicking the Add button to open the Add Computer dialog box or search for
computers to exclude by clicking the Search button to open the Computer dialog
box. The Computer dialog box allows you to select computers using a range of
criteria, such as partial and full string matching on the name, wild-card string
matching, and regular expressions.
8. Click Next to open the Search for Computers page. This page allows you to search for
computers to add to the group based on their function (such as server, client, or
domain controller), or based on the name using options similar to those in the Computer
dialog box discussed in the previous step of this procedure. If you do not want to add
more computers to this group, select the Do not search for computers option.
If you want to select computers based on a formula in the next step of this procedure,
you must provide the relevant criteria on the Search for Computers page of the wizard.
9. Click Next to open the Formula page. Here, you can specify a formula that selects
computers based on the criteria you entered in the previous page. You can generate
the formula using a range of attributes for the target computers, such as the IP
address, subnet, operating system, fully qualified domain name, and more. You can
also use a range of operators and string matching functions, and select from lists of
other computer groups. If you do not want to add more computers to this group,
select the Do not use a formula to determine membership for this computer group
option.
You can use a custom registry key located on the remote computer as an attribute
instead of one of the existing attributes created by Operations Manager and the
Management Packs you install. Click Attribute, click New on the Set Attribute page,
and then specify details of the registry key and value as required.
10. Click Next to open the State Rollup Policy page. Here, you specify how the overall
state for a computer group will reflect the states of individual members of the group.
The members can be the subgroups included within this group and/or the individual
computers in the group. The three options on this page (see Figure 20) are the
following:
◦ The worst state of any member computer or subgroup. If you select this option,
Operations Manager will set the State value displayed in the Operator Console
to that specified for the Severity for the worst of the current unresolved alerts
for the members of this group. The alert Severity states range from Success
(best) to Server Unavailable (worst). You can see a list of these states in the
Alert page of the Properties dialog box for any of your existing event rules,
performance rules, or alert rules.
◦ The worst state from the specified percentage of best states in the computer
group. If you select this option, you must specify a percentage that defines the
proportion of the group that will act as the state indicator for the group. Operations
Manager will select a set of members from the group that consists of the
computers with the best health state up to the percentage you specified of the
total group membership. In other words, if there are 10 computers and you
specify 60%, Operations Manager will select the six members of the group that
currently have the least severe state. It then uses the worst (the most severe)
state of the subset it selects as the overall (rolled-up) state for the group, and
displays this in the Operator Console as the State value for this computer group.
◦ The best state of any member or subgroup. If you select this option, Operations
Manager will set the State value displayed in the Operator Console to that
specified for the Severity for the best of the current unresolved alerts for the
members of this group. It is unlikely that you will use this option very often,
because it effectively hides the state of most of the members of the group as
long as one member is performing correctly.

Figure 20
Specifying the State Rollup Policy for a computer group
11. Click Next to open the Confirmation page, which provides a summary of the options
you have set in the wizard. To change any settings, click the Back button to return to
the relevant page.
12. If you are happy with the settings shown, click Next, and then click Finish. The new
computer group appears in the Administrator Console tree view. If you specified any
existing groups as subgroups of the new group, they move to appear under the new
group in the tree view.
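The "worst state from the specified percentage of best states" policy in step 10 is easy to misread, so here is a minimal Python sketch of the computation described above. The function name is an invention, and the severity list is abbreviated from the one shown on the Alert page of a rule's Properties dialog box:

```python
def rollup_state(member_states, percent):
    """'Worst state from the specified percentage of best states':
    keep the healthiest portion of the group, up to 'percent' of its
    membership, then report the worst state inside that subset."""
    order = ["Success", "Information", "Warning", "Error",
             "Critical Error", "Server Unavailable"]  # best to worst
    rank = {name: i for i, name in enumerate(order)}
    best_first = sorted(member_states, key=lambda s: rank[s])
    keep = max(1, int(len(member_states) * percent / 100))
    return best_first[keep - 1]  # worst state within the kept subset

# Ten computers at 60%: the six healthiest are kept, and the worst of
# those six becomes the rolled-up state shown in the Operator Console.
states = ["Success"] * 5 + ["Warning"] + ["Critical Error"] * 4
```

With the example membership above, a 60% policy reports Warning even though four computers are in a Critical Error state, whereas a 100% policy behaves like the worst-state option and reports Critical Error.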
To associate a rule group with a computer group and deploy the rules
1. In the left-side tree view of the Administrator Console, expand the list to show the
Rule Groups entry (which is under the Management Packs entry). If the tree-view
pane is not visible, click Customize on the View menu. In the Customize View dialog
box, select the Console tree check box, and then click OK.
2. Expand the list of rule groups, and then right-click the group of rules you want to
deploy to a specific set of computers. On the shortcut menu, click Associate with
Computer Group to open the Properties dialog box for this rule group with the
Computer Groups page selected. Alternatively, you can right-click the rule group,
select Properties, and then select the Computer Groups tab.
3. On the Computer Groups page, click Add to open the Select Item dialog box. Select
the computer group to which you want to deploy the rules in this rule group, and
then click OK. Repeat the process if you want to deploy the rules to more than one
computer group.
4. Back in the Properties dialog box for the rule group, click OK.
5. If you want to immediately force the rules in this rule group through to the
Operations Manager agents on remote computers, instead of waiting for the
scheduled update cycle, right-click Management Packs in the left-side tree view, and
then click Commit Configuration Change.
By default, Operations Manager pushes rule changes to all remote agents every five
minutes. To change this value, right-click Global Settings in the left-side tree view, click
Management Server Properties, click the Rule Change Polling tab, and then
select the required value.

Guidelines for Creating an Operations Manager 2005 Computer Group and Deploying the Operations Manager Agent and Rules
When creating an Operations Manager 2005 computer group and deploying the Operations
Manager agent and rules, you should consider the following proven practices:

• Create a top-level computer group that includes all the computers that will execute the
application, and which you want to monitor. If the application has distinct
sections, such as separate Web services running on different computers or separate
groups of servers that may be in use at different times, create separate child computer
groups for each set of computers within a parent (top-level) computer group.
• Use the state rollup options for the top-level computer group to specify the overall
state for all the computers involved in the application, so the console displays the
appropriate state indication to operators. Use the appropriate severity settings for each
rule to represent the three basic states RED ("failed" or "unavailable"), YELLOW
("degraded"), and GREEN ("working normally" or "available").
• Combine the state of each subgroup using the same approach as for the top-level
group, so operators can drill down, monitor, and see the state of individual components
or sections of the application. This makes diagnosis of problems easier.
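Mapping rule severities onto the three-state RED/YELLOW/GREEN model mentioned above is ultimately a per-application decision. The sketch below shows one plausible assignment as an assumption for illustration, not a fixed Operations Manager mapping:

```python
def traffic_light(severity):
    """Map an alert severity onto the three-state management-model
    paradigm. The grouping below is one plausible assignment; choose
    severities for your rules so this mapping reflects real impact."""
    if severity in ("Success", "Information"):
        return "GREEN"   # working normally / available
    if severity == "Warning":
        return "YELLOW"  # degraded
    return "RED"         # failed / unavailable
```

Choosing severities for each rule with a mapping like this in mind keeps the rolled-up group state meaningful to operators who drill down from the top-level group.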

View Management Information in Operations Manager 2005
After you import or create a Management Pack for your application, you can use it to monitor
your application. You will also usually take advantage of existing Management Packs, provided
with Microsoft Operations Manager 2005 or downloaded from TechNet at
http://www.microsoft.com/technet/prodtechnol/mom/mom2005/catalog.aspx. These
additional Management Packs allow you to detect faults in the underlying infrastructure, such as
performance degradation or core operating system service failures, and monitor services such
as Microsoft Exchange and SQL Server.
This section includes procedures for using both the Operator Console and the Web Console. The
Operator Console allows you to view the state of an application and drill down to see details of
the events, alerts, performance counters, and computers that run the application. The Web
Console has less functionality, but it can still be of great use to operators, particularly when the
Operator Console is not installed.
To view state information, alerts, events, and computers in Operations Manager 2005
using the Operator Console
1. Open the Operations Manager 2005 Operator Console, and use the Group: drop-
down list at the top of the window to select the computer group for which you want
to view information. Click the State link in the Navigation pane at the lower-right section
of the window to show the overall health state for the application you selected in
the Group: drop-down list (see Figure 21).
Figure 21
The State view of an application in the Operations Manager 2005 Operator Console

If you cannot see all of the panes shown in Figure 21, on the View menu, select the
pane you want to open (Navigation Pane or Detail Pane).
2. Figure 21 indicates that the overall state for this computer group (all the computers
running this application) is Critical Error. The lower section of the window shows the
computers in this computer group (in this case, there is only one), and indicates the
total number of open or unresolved alerts, and the total number of events.
3. Click the Alerts link in the navigation pane to see all the open alerts for the computer
group. The lower window now shows details of the selected alert, including the
properties (field values) of the event or counter threshold that caused the alert (see
Figure 22). The Alert Details section in the lower window also displays the product-
specific and company-specific knowledge for the rule that detected the problem.
This knowledge assists in diagnosing, resolving, and verifying resolution of the
problem that originally caused the alert (see Figure 23).
Figure 22
The list of all alerts for the computer group and details about the selected alert
Figure 23
Viewing the product knowledge for the alert

4. To view only the alerts for a specific computer within the computer group, go back
to the State view and double-click an alert in the State upper window for the
computer you want to view, or double-click the computer in the State Details lower
window. You see the same view as in Figure 22, but it contains a list of only the
alerts for the selected computer.
5. Click the Events link in the navigation pane to see all the events from the Windows
Event Log for computers in the computer group. The list shows the domain and
computer names, and the lower window contains the values of the event fields for
the event selected. You can view a list of alerts raised by this event on the Alerts
tabbed page in the lower window, and the parameters of the event on the
Parameters tabbed page (see Figure 24).
Figure 24
The list of events for all computers within the selected computer group

Right-click the upper window in any view, and then click Personalize View to select the
columns displayed in the list or to change the order of the columns.
6. Click the Performance link in the navigation pane to see a list of all the computers
within the currently selected Group: scope. Select a computer in the list, and then
click the Select Counters button to display a list of all the performance counters for
that computer. This includes the standard operating system counters implemented
by the built-in Management Packs in Operations Manager 2005, such as processor
usage and elapsed time (see Figure 25).
Figure 25
Selecting a performance counter to view

7. Select the check boxes next to the counters you want to view results for, and then
click the Draw Graph button. In Figure 26, you can see the results for the
WSTransport Service counter implemented in an example application.
Figure 26
A chart showing performance counter data samples collected by Operations Manager

8. Click the Computers and Groups link in the navigation pane to see a list of all the
subgroups within the current group (the group selected in the Group: drop-down list
at the top of the window) and the state of each one. Double-click a subgroup to
navigate to that group and view the state and details of the group.
9. Click the Diagram link in the navigation pane to see a schematic diagram of the
current computer group, its subgroups, and the computers within each group. It also
displays the current health state of each group and computer (see Figure 27). This
makes it easy for operators to grasp visually the overall state of the application and
the individual components.
Figure 27
A computer group in Diagram view showing the state of each computer

10. Double-click a computer (not a computer group) in the right window in Diagram
view to switch to Alerts view for that computer.
11. You can use the My Views link in the navigation pane to create custom views of the
monitoring information. You can also define custom Public Views for viewing in the
Operator Console using the Console Scopes section within the Administration
section of the Administrator Console. For more details, see the Operations Manager
Help file.

Operations Manager 2005 also installs a Web-based Operator Console. While this has fewer
features, it can be used for remote monitoring and problem diagnosis from locations outside
your own network.
To use the Web Console for remote monitoring and problem diagnosis
1. To open the Web Console from the Administrator Console, select the Microsoft
Operations Manager entry directly below the console root in the left-side tree view.
On the right-side Home page, click the Start Web Console link in the Operations
section of the page.
2. To discover the URL of the Web Console, expand the Administration section of the
left-side tree view in the Administrator Console, and then select the Global Settings
entry. Double-click Web Addresses in the right window to see the Web Console
Address. This is, by default, a non-standard port on the local computer, such as
http://machine-name:1272. Enter this URL into your Web browser.

The Web Console provides three views of the monitoring information: Alerts, Computers, and
Events. These are very similar to the views you see in the Operator Console. For example, Figure
28 shows the Alerts view in the Web Console. You can select an alert and view the properties,
events, knowledge, and history just as you can in the Operator Console.

Figure 28
The Alerts view in the Operations Manager 2005 Web Console showing the product knowledge

Guidelines for Viewing Management Information in Operations Manager 2005
When viewing management information in Operations Manager 2005, consider the following
proven practices:
• If you connect directly to the management domain, use the Operator Console to
monitor applications and computers. If you connect from a remote location over the
Internet or an intranet, use the Web Console to monitor applications and computers.
• Use the drop-down Groups: list to limit your view to the appropriate computer group
and its subgroups, unless you want to see alerts raised by all the managed computers
for all events.
• Use the State view and the Diagram view to provide an overall picture of the health
state of the application. In Diagram view, you can also see the state of the subgroups
and individual computers.
• Use the Alerts view to obtain a list of alerts, generally sorted by descending severity,
which is useful in prioritizing diagnosis and resolution requirements, and the
corresponding actions.
• Use the Events view to see the details of source events, and use the Performance view
to see the values and history of performance counter samples. Both are useful in
diagnosing problems and verifying resolution.
• Use the Administrator Console to create custom views if you want to restrict the
features of the Operator Console available to specific groups of users, or to all users.

Create Management Reports in Operations Manager 2005


Regular and continuous monitoring makes it easier to detect application failures, problems, or
unsatisfactory performance, but the actions taken by administrators and operations staff are
usually short-term in nature. They tend to concentrate on the present, and they may not take
into account historical events and performance over longer periods that indicate fundamental
issues or causes.
However, business owners and hosting services must often conform to specified service level
agreements (SLAs) on performance and availability. The data required to support these
agreements only appears over longer periods and requires access to historical information.
Data captured in summary reports can also be vital to operations staff in detecting missed
computers, or incorrectly configured application or computer groups, particularly in large and
complex installations. These reports may be the only way that operations staff can match
monitoring infrastructure to the physical hardware.
Microsoft Operations Manager 2005 includes a report generator that uses SQL Server Reporting
Services to publish performance and error details it collects while monitoring applications. This
can provide useful summaries of application performance, and the history of issues encountered
with an application. You can use the reports to view the overall performance over time and
detect specific problem areas within your application.
To view monitoring and management reports in Operations Manager 2005
1. Start the Reporting Console from the Microsoft Operations Manager 2005 section of
your Start menu. Alternatively, unless you specified a different location when
installing SQL Server Reporting Services, you can open the SQL Server Reporting
Console in a Web browser by entering the address http://localhost/Reports.
The Reporting Console and Web Console are optional features that you must install
when you install Operations Manager 2005. If you encounter problems when opening
the reports, check that the SQL Server Reporting Services service is running.
2. When prompted, enter the user name and password of an account that has
permission to access the Operations Manager reporting data in SQL Server Reporting
Services. Usually this is an administrator-level account for the monitoring domain.
3. If you opened the Reporting Console from your Start menu, you will see options to
view reports for Microsoft Operations Manager, Operational Data Reporting, and
Operational Health Analysis. If you opened SQL Server Reporting Services in your
browser, you must click the Microsoft Operations Manager Reporting link in the
SQL Report Manager Home page to get to this menu.
4. Click the Microsoft Operations Manager link to see a menu of the operational
reports. These include details of the Operations Manager agents installed on the
management server and the remote computers and a summary of the health and
performance of the management group.
5. Select a report and enter the criteria in the controls at the top of the report page,
then click the View Report button. For example, Figure 29 shows the Management
Group Agents report, with the Management Group: drop-down set to the name of
the required management group.

Figure 29
Viewing the Management Group Agents report for an Operations Manager management group

The other two links on the Microsoft Operations Manager Reporting page open submenus
containing a range of other pre-defined reports. The Operational Data Reporting page contains
links to view all alerts and events, as well as the general health and a report listing any script or
response errors.
The Operational Health Analysis page contains a number of more detailed reports that drill
down into the operational history of the management group. These include analysis of alerts,
events, and performance by type, severity, time, frequency, and computer group. You can also
view reports on the association between rule groups, computer groups, and individual
computers.

Guidelines for Creating Management Reports in Operations Manager 2005
When creating management reports in Operations Manager 2005, consider the following
proven practices:
• Use the Operations Manager Reporting Console to view the historical performance of
an application to ensure it performs within the service level agreements or the
parameters defined by business rules.
• Use the reports to discover inconsistencies in performance, check overall reliability, and
detect problematic situations such as unreliable networks—and the times when these
issues most commonly arise.
• Use the reports to confirm management coverage of the computers running the
application and deployment of the appropriate sets of rules to each group.

Summary
Management Packs can be a very useful tool for the operations team in managing applications.
This chapter demonstrated how to create and import Management Packs in Operations
Manager 2005 and then showed how to edit the Management Packs to provide the functionality
required when monitoring an application.
Chapter 17
Creating and Using System Center
Operations Manager 2007
Management Packs
Chapter 16 of this guide described creating and authoring Management Packs in Microsoft
Operations Manager 2005. This chapter describes how to perform the same tasks using System
Center Operations Manager 2007. This chapter discusses the same scenarios for creating and
using Management Packs. It describes in detail the following:

• Converting and importing a management model from Operations Manager 2005
• Creating a management pack in the Operations Manager 2007 Operations Console
• Editing an Operations Manager 2007 Management Pack
• Viewing management information in Operations Manager 2007
• Creating management reports in Operations Manager 2007

The Transport Order application is used as a running example throughout this chapter. This
application forms part of the shipping solution in the Northern Electronics worked example
used throughout this guide.

Convert and Import a Microsoft Operations Manager 2005 Management Pack into Operations Manager 2007
The format of Management Packs differs between Microsoft Operations Manager 2005 and
System Center Operations Manager 2007. Therefore, you cannot import Microsoft Operations
Manager 2005 Management Packs directly into Operations Manager 2007. Instead, you must
convert these to the appropriate format, or recreate them using the Operations Manager 2007
tools.
To convert and import a Microsoft Operations Manager 2005 Management Pack for
Operations Manager 2007
1. Obtain the Microsoft Operations Manager 2005 Management Pack (.akm file) that
contains the rules, alerts, notifications, and computer groups you want to implement
in Operations Manager 2007 by doing the following:
◦ Export a management model from your management model designer (such as
the Microsoft Management Model Designer) as a Microsoft Operations Manager
2005 Management Pack.
◦ Export an existing Management Pack from Microsoft Operations Manager 2005
using the Management Pack Export Wizard.
◦ Acquire the appropriate Management Pack from a third-party provider.
2. Copy the .akm file into the folder where you installed Operations Manager 2007. By
default, this is %ProgramFiles%\System Center Operations Manager 2007\.
3. Open a Command window from your Start menu, navigate to the Operations
Manager 2007 folder where you placed the .akm file, and use the MP2XML tool to
convert the Microsoft Operations Manager 2005 Management Pack to an
Operations Manager 2007-compatible XML file. The syntax is the following:
mp2xml [folder_name\]source_file.akm [destination_folder\]
destination_file.xml

4. Use the MPConvert tool to convert the XML file into an Operations Manager 2007
Management Pack file. The syntax is the following:
mpconvert [folder_name\]source_file.xml [destination_folder\]
new_filename.xml

5. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Administration
button. If the navigation pane is not visible, click Navigation Pane on the View
menu.
6. In the left-side tree view, right-click Administration (at the top of the tree), and then
click Import Management Pack(s).
7. In the Select Management Pack(s) to import dialog box, select the .mp or .xml file
for the Management Pack you want to import. You can hold down the SHIFT or CTRL
keys while clicking to select more than one file.
Files with the .mp file name extension are Sealed Management Packs that you cannot
edit. Files with the .xml file name extension are Unsealed Management Packs that you
can edit.
8. Operations Manager imports the Management Packs you selected and installs them.
A dialog box reports the results, indicating any that it cannot import. When you close
the dialog box, Operations Manager 2007 begins monitoring; it collects the same
data as in Microsoft Operations Manager 2005.
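If you have several .akm files to convert, steps 3 and 4 can be scripted rather than typed by hand. The following Python sketch is illustrative only: the file names and output layout are assumptions, the mp2xml and mpconvert tools must be on the system PATH, and exact tool behavior may vary by installation.

```python
import subprocess
from pathlib import Path

def build_conversion_commands(akm_path: str, out_dir: str) -> list[list[str]]:
    """Build the two command lines described in steps 3 and 4:
    mp2xml converts the .akm file to an intermediate XML file, and
    mpconvert upgrades that XML to the Operations Manager 2007 format.
    The output file names chosen here are purely illustrative."""
    stem = Path(akm_path).stem
    intermediate = str(Path(out_dir) / f"{stem}.xml")
    converted = str(Path(out_dir) / f"{stem}.om2007.xml")
    return [
        ["mp2xml", akm_path, intermediate],
        ["mpconvert", intermediate, converted],
    ]

def convert_management_pack(akm_path: str, out_dir: str) -> None:
    """Run both conversion steps, stopping on the first failure."""
    for cmd in build_conversion_commands(akm_path, out_dir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

You could then loop over a folder of exported .akm files and call `convert_management_pack` for each, importing the resulting .xml files in a single pass through the Import Management Pack(s) dialog box.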

Guidelines for Converting and Importing a Microsoft Operations Manager 2005 Management Pack into Operations Manager 2007
When converting and importing a Microsoft Operations Manager 2005 Management Pack into
Operations Manager 2007, you should consider the following guidelines:
• Ensure that you maintain all current information in your management model, so that
you can export it from your management model editor ready for use in Operations
Manager 2007.
• Apply any changes you make to the application and the monitoring environment
following feedback or run-time experience, especially where this affects the
instrumentation of the application.
• If you are already using Microsoft Operations Manager 2005, back up your
Management Packs, by exporting them using the Export Wizard in Microsoft Operations
Manager 2005, whenever you make changes to them or add customizations.
• Use the conversion tools provided with Operations Manager 2007 to convert your
Microsoft Operations Manager 2005-format Management Pack (.akm) files to the
correct Operations Manager Management Pack format.

Creating a Management Pack in the Operations Manager 2007 Operations Console
If you have not imported a Management Pack from another source, you will need to use the
Operations Console in System Center Operations Manager 2007 to create a Management Pack
for your application. Although the modeling concepts in Operations Manager 2007 more closely
match the management model concepts contained in this guide, there are still complications in
mapping the two. By using rules and monitors, you can update state variables
correspond to the health state of an application. You can also define tasks and probes that
execute scripts or commands at specified intervals.
Rules that detect an event can generate an alert to display in the console, and they can send a
message to operators via e-mail or pager. You can create notification groups and add operators
to these groups to make it easier to manage notification. Rules that collect data and store it in
the Operations Manager 2007 database do not create alerts directly. However, you can
associate unit monitors with one or more event or performance rules so that the monitor raises
an alert when Operations Manager collects an event or a performance counter value that
matches the criteria for that monitor.
Other types of monitors you can create include probe monitors that check the status of a
process, such as a Web application or a database, at pre-defined intervals and raise an alert if
the target component fails; and Rollup Monitors that determine the overall health state
exposed by a set of rules and monitors.
By assigning individual rules and monitors to groups, you can associate these rules and monitors
with a specific section or component of the application. This makes it easier to update the
monitoring configuration as the physical layout of the monitored application and its
components change over time. You can also assign knowledge from the management model
that is common to a set of rules to the group, reducing duplication of effort and making
knowledge updates easier.
In Operations Manager 2007, a monitoring group is referred to as an instance group, and can
contain nested subgroups. You can also create a distributed application that consists of a series
of related managed objects; therefore, it contains all the services and components of your
application. A distributed application consists of a series of nested instance groups, and you can
use templates provided with Operations Manager 2007 to create distributed applications of
various types.
After creating the groups, you can create the rules and monitors for the application—associating
these rules and monitors with the appropriate groups as you create them. You can then handle
the monitored application as one entity. This makes it much easier to manage the creation,
monitoring, and editing of the rules and groups.
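The relationship between unit monitors and rollup monitors described above can be sketched in a few lines of Python. This is purely illustrative: the health states, threshold values, and worst-of rollup policy are simplified assumptions for the sketch, not Operations Manager APIs (Operations Manager 2007 supports other rollup policies as well).

```python
from enum import IntEnum

class Health(IntEnum):
    """Simplified health states, ordered by severity."""
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2

def unit_monitor_state(counter_value: float,
                       warning_threshold: float,
                       critical_threshold: float) -> Health:
    """A simple threshold unit monitor: map one performance counter
    sample to a health state (thresholds are illustrative)."""
    if counter_value >= critical_threshold:
        return Health.CRITICAL
    if counter_value >= warning_threshold:
        return Health.WARNING
    return Health.HEALTHY

def rollup_health(child_states: list[Health]) -> Health:
    """Worst-of rollup: the parent's health is the most severe state
    reported by any child monitor. An empty group reports healthy."""
    return max(child_states, default=Health.HEALTHY)
```

For example, if one Web server in a group reports a CPU sample above the critical threshold while the others are healthy, a worst-of rollup monitor on the group surfaces a critical state at the application level.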
This section contains the following procedures:
• To create a new Management Pack in the Operations Manager 2007 Operations
Console
• To create a new distributed application in the Operations Manager 2007 Operations
Console
• To create a new monitoring group in the Operations Manager 2007 Operations Console
• To create a new rule for a group in the Operations Manager 2007 Operations Console
• To create a probe monitor in the Operations Manager 2007 Operations Console
• To create a unit monitor in the Operations Manager 2007 Operations Console
• To create a health rollup monitor in the Operations Manager 2007 Operations Console

To create a Management Pack in the Operations Manager 2007 Operations Console


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Administration
button. If the navigation pane is not visible, click Navigation Pane on the View
menu.
2. Expand the tree view in the left pane of the main window to show the Management
Packs node if it is not already visible. Right-click the Management Packs node, and
then click Create Management Pack.
3. Enter the name for the Management Pack and the version number. For a new
Management Pack, use the version number 1.0.0.0. Also, enter a description that
will help administrators and operators to identify the Management Pack.
4. Click Next to display the Knowledge Article page. Click the Edit button, and enter the
knowledge that will help administrators and operators to diagnose, resolve, and
verify resolution of errors.
You must have Microsoft Office and the Visual Studio Tools for Office runtime on the
computer where you want to create and edit the knowledge for Management Packs.
5. Click Create, and the new Management Pack appears in the list of all Management
Packs in the main window of the Operations Console.

To create a distributed application in the Operations Manager 2007 Operations Console


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Distributed
Applications node if it is not already visible. Right-click the Distributed Applications
node, and then click Create a new distributed application.
3. In the Distributed Application Designer dialog box that opens, enter the name for
the new distributed application and a description that will help administrators and
operators to identify the application.
4. Select a template that will help you to define the application from the Template list,
such as Line of Business Web Application or a Messaging Application. As you select
each template, the dialog box displays a description of the target application type.
To see more details of the selected type, click the View Details link next to the
Template list. If you want to create the distributed application hierarchy yourself,
select Blank (Advanced) in the Template list.
5. Specify the Management Pack to which you want to add the new distributed
application. Select the Management Pack you created in the previous procedure. If
you have not already created a Management Pack, click the New button and follow
the instructions in the earlier procedure, "To create a new Management Pack in the
Operations Manager 2007 Operations Console."
6. Click OK in the Distributed Application Designer dialog box. You now see the
Distributed Application Designer window, where you can design the distributed
application model.
7. If you selected one of the existing templates, you will see the objects and
relationships from that template in the Designer window. For example, Figure 1
shows the result of selecting the Line of Business Web Application template.

Figure 1
The Distributed Application Designer
8. If you selected the Blank (Advanced) option in the Distributed Application Designer
dialog box, you will see an empty designer surface. To add items to the designer,
click the Add Component button in the toolbar at the top of the window to open the
Create Component Group dialog box, where you specify the type of component you
want to add. Enter a name for the new component, and select the Objects of the
following type(s) option. Then select the component type in the tree view at the
bottom of the Create Component Group dialog box.
The list contains a wide selection of possible component types. For a Web-based
application or Web service, expand the Application Component node of the tree view
to see components such as Database and Web Site. For a Windows-based application,
expand the Local Application node of the tree view and then expand the Windows
Local Application node to see the various types of user and local application types.
These include Health Service components such as a Management Server, Notification
Server, Windows Cluster Service, Windows Local Service, and Windows User
Application.
9. To create a relationship between the items you add to the designer, click the Create
Relationship button in the toolbar at the top of the window, click the source item in
the relationship, and then click the target item. This creates a relationship such that
the source item "uses" (depends on) the target item and the arrow points towards
the target item. Click the Create Relationship button again to switch out of the
Create Relationship mode and return to the normal "arrow" mouse pointer.
You use component groups and relationships to separate the sets of rules for each
component into logical groups that correspond to the separation between the
components of the application. You can apply rollup rules to the overall health state of
the component groups to generate the appropriate health state indication at higher
levels of the application structure.
10. Click the Save button in the toolbar at the top of the window and close the
Distributed Application Designer window.

To create a new monitoring group in the Operations Manager 2007 Operations Console
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Groups node
if it is not already visible. Right-click the Groups node, and then click Create a new
Group.
3. In the Create Group Wizard dialog box, enter a name for the group and type in a
description that will help administrators and operators to identify the group.
4. Select the Management Pack to which you want to add the new group in the drop-
down list at the bottom of the dialog box. If you have not already created a
Management Pack, click the New button and follow the instructions in the earlier
procedure, "To create a new Management Pack in the Operations Manager 2007
Operations Console."
5. Click Next in the Create Group Wizard dialog box to show the Choose Members from
a List page. On this page, you can explicitly choose the members for the new group.
To add a member, click the Add/Remove Objects button to open the Object
Selection dialog box. Select the type of entity you want to add in the Search for
drop-down list or leave the list set to Entity to search for all suitable objects. Enter
all or part of the name of the items you want to find in the Filter by part of name
text box or leave it blank to search for all items of the selected type.
6. Click the Search button to display the items that match your selection in the
Available items list. The list shows the available entities (computers, databases,
sites, and applications) based on a range of features, such as the name, operating
system, or status within the Operations Manager 2007 environment (such as
Notification Server). Select the individual items you want to add, and then click the
Add button. You can hold down the SHIFT and CTRL keys while clicking to select
more than one item. To remove an item selected in the Selected objects list, click
the Remove button.
7. Click OK in the Object Selection dialog box to return to the Create Group Wizard
dialog box, and then click Next to show the Create a Membership Formula page. On
this page, you can create rules and use formulae to automatically select computers
to add to the new group. Click the Create/Edit rules button to open the Query
Builder dialog box, and then select the type of items you want to add to the group in
the drop-down list at the top of the window.
8. Click the Add button to create a row in the grid where you can specify an expression
for selecting items. In the first column of the conditional expression row, select a
property for the item you added, such as the Display Name, and then select the
criteria for matching the property in the second column of the grid. You can use a
range of criteria, such as partial and full string matching on the name, wild-card
string matching, and regular expressions. Enter the criteria value for this row in the
third column of the grid. Then repeat the process to add more conditional
expressions to the grid as required.
Clicking the Insert button adds a conditional expression row to the grid. However, if
you click the small "down arrow" next to the Insert button, you can create a series of
AND and OR groups containing conditional expressions. Select an expression row and
click the Formula button to view the conditional expression for that row, or click the
Delete button to remove any row from the grid.
9. After you create any rules you require for selecting objects, click Next in the Create
Group Wizard dialog box to show the Choose Optional Subgroups page. On this
page, you can select other groups you have already created to build a hierarchy of
groups that allows you to use rollup rules to expose the health state of the group
members as a whole. Click the Add/Remove Subgroups button to open the Group
Selection dialog box, and enter any part of the name of the group(s) you want to add
in the text box at the top of the window. If you want to see a list of all groups, leave
the text box empty.
10. Click the Search button, and the Available items list shows all available groups.
Select the groups you want to add as children of the new group, and then click the
Add button. You can hold down the SHIFT and CTRL keys while clicking to select
more than one item. To remove an item selected in the Selected objects list, click
the Remove button.
11. Click OK in the Group Selection dialog box to return to the Create Group Wizard
dialog box, and then click Next to show the Specify Exclude List page. Here, you can
specify any objects you do not want to include in the group, which the previous rules
set up in the Create Group Wizard would include.
12. Click the Exclude Objects button to open the Object Exclusion dialog box, and select
the type of entity you want to exclude in the Search for drop-down list or leave the
list set to Entity to search for all suitable objects. Enter all or part of the name of the
items you want to find in the Filter by part of name text box or leave it blank to
search for all items of the selected type.
13. Click the Search button to display the items that match your selection in the
Available items list, select the individual items you want to exclude, and then click
the Add button to add them to the Selected objects list. You can hold down the
SHIFT and CTRL keys while clicking to select more than one item. To remove an item
selected in the Selected objects list, click the Remove button.
14. Click OK in the Object Exclusion dialog box to return to the Create Group Wizard
dialog box, and then click Create. The new group appears in the Groups list in the
Operations Console.
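The Create Group Wizard combines explicit members, formula matches, and exclusions to produce the final membership. The sketch below models only two parts of that logic — wild-card formula rules on the display name (step 8) and the exclude list (steps 11 through 13) — and is an illustration of the idea, not of Operations Manager's actual evaluation engine.

```python
import fnmatch

def group_members(candidates: list[str],
                  include_patterns: list[str],
                  exclude_names: set[str]) -> list[str]:
    """An object becomes a member if its display name matches any
    include pattern (wild-card matching, as in step 8) and it is not
    on the exclude list (steps 11-13). Exclusions win over inclusions."""
    members = []
    for name in candidates:
        if name in exclude_names:
            continue  # the exclude list overrides any matching rule
        if any(fnmatch.fnmatch(name, pattern) for pattern in include_patterns):
            members.append(name)
    return members
```

For instance, a pattern such as `WEB*` would select every computer whose name starts with WEB, while still letting you exclude one decommissioned server by name — which mirrors why the wizard presents the exclude list after the membership formula.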

To create a new rule for a group in the Operations Manager 2007 Operations Console
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Rules node
(which is under the Management Pack Objects node) if it is not already visible, and
then click the Rules node to select it. The main window shows a list of all the rules
installed in Operations Manager 2007, grouped by type.
3. Click the Change Scope hyperlink in the small notification area above the list to open
the Scope MP Objects by target(s) dialog box. You can use this feature to limit the
list of items to those within a particular scope (such as a Management Pack, group,
or distributed application), which makes it easier to find and work with the rules and
other objects you create.
4. Type part of the name of the group or Management Pack you want to scope in the
Look for text box at the top of the Scope MP Objects by target(s) dialog box. The list
changes to show the matching items, and a check mark appears in the check boxes of these
matching items. To see all the items, select the View all targets option button, select
the check boxes of any other targets you want to include, and then click OK.
Alternatively, use the Look for text box and the Find Now button below the
notification area to select specific rules that match a search string.
5. Right-click the Rules node in the left-side tree view, and then click New rule, or click
the New rule link on the toolbar or in the Actions window at the right of the main
window to start the Create Rule Wizard. If you cannot see the Actions window, click
Actions on the View menu.
6. The Select a Rule Type page of the Create Rule Wizard allows you to select the type
of rule you want to create. You can create an alert generating rule based on an
event; a collection rule based on an event, a performance counter, or a probe; or a
timed command that executes a command or a script. Figure 2 shows the rule type
selection page of the wizard.
Figure 2
The different types of rule available in the Create Rule Wizard
• For an alert generating rule, you can select the following:
◦ Generic CSV Text Log (Alert). This rule type matches against the entries stored
in a Comma-Separated-Values log file and generates an alert when a value that
you specify using a pattern matches an entry in the log file.
◦ Generic Text Log (Alert). This rule type matches against the entries stored in a
generic text log file and generates an alert when a value that you specify using a
pattern matches an entry in the log file.
◦ NT Event Log (Alert). This rule type matches against the properties of events in
Windows Event Log of the monitored computers and generates an alert when a
matching event occurs. You can match on any of the fields of an event log entry,
such as the name, computer name, event number, category, and description.
◦ SNMP Trap (Alert). This rule type listens for events generated by specific classes
and traps of an SNMP provider on the monitored computers and generates an
alert when a matching event occurs.
◦ Syslog. This rule type matches against syslog entries forwarded to the
monitored computers, and generates an alert when a matching event occurs.
You can match on any of the values in the incoming syslog entry.
◦ WMI Event (Alert). This rule type uses a Windows Management Instrumentation
(WMI) query within a namespace you specify, which runs at intervals you define,
to query WMI objects and generate an alert when a query match occurs.
• For a collection rule, you can select from three categories of rule type, located in the
three folders named Event Based, Performance Based, and Probe Based. The collection
rule types are the following:
◦ Generic CSV Text Log (event-based rule). This rule type collects and logs to the
Operations Manager database entries stored in a comma-separated values log
file, using pattern matching to locate entries in the log file.
◦ Generic Text Log (event-based rule). This rule type collects and logs to the
Operations Manager database entries stored in a generic text log file, using
pattern matching to locate entries in the log file.
◦ NT Event Log (event-based rule). This rule type collects and logs to the
Operations Manager database events occurring in the Windows Event Log of the
monitored computers.
◦ SNMP Event (event-based rule). This rule type collects and logs to the
Operations Manager database events from a specified SNMP provider on the
monitored computers.
◦ SNMP Trap (Event) (event-based rule). This rule type collects and logs to the
Operations Manager database event traps from a specified SNMP provider on
the monitored computers.
◦ Syslog (event-based rule). This rule type collects and logs to the Operations
Manager database syslog entries forwarded to the monitored computers.
◦ WMI Event (event-based rule). This rule type uses a WMI query within a
namespace you specify, which runs at intervals you define, to collect and log
results to the Operations Manager database.
◦ SNMP Performance (performance-based rule). This rule type collects and logs to
the Operations Manager database performance counters exposed by a specified
SNMP provider on the monitored computers.
◦ WMI Performance (performance-based rule). This rule type collects and logs to
the Operations Manager database performance counters exposed through WMI
on the monitored computers.
◦ Windows Performance (performance-based rule). This rule type collects and
logs to the Operations Manager database values from Windows performance
counters defined on the monitored computers.
◦ Script (Event) (probe-based rule). This rule type collects and logs to the
Operations Manager database details of events that cause a specified script to
run when a matching event occurs on the monitored computers.
◦ Script (Performance) (probe-based rule). This rule type collects and logs to the
Operations Manager database values of performance counters produced when a
specified script runs on the monitored computers.
• For a timed command, the rule types are the following:
◦ Execute a Command. This rule type runs a specified command using the
Operations Manager Windows command shell at the intervals you specify.
◦ Execute a Script. This rule type runs a specified script, either VBScript or JScript,
at the intervals you specify.
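The text-log rule types above all share the same underlying idea: scan entries in a log file and react when a pattern matches an entry. The following sketch illustrates that idea only; the function name, log format, and alert shape are hypothetical and are not part of Operations Manager:

```python
import re

def scan_log_for_alerts(log_lines, pattern):
    """Return an alert record for each log entry the pattern matches,
    roughly what a Generic Text Log (Alert) rule does conceptually."""
    matcher = re.compile(pattern)
    alerts = []
    for line_number, line in enumerate(log_lines, start=1):
        if matcher.search(line):
            alerts.append({"line": line_number, "entry": line})
    return alerts

# Illustrative log entries; a real rule reads these from the file you
# name on the Application Log Data Source page.
log = [
    "2008-08-01 10:02:11 INFO  Shipping service started",
    "2008-08-01 10:05:42 ERROR Could not connect to database",
    "2008-08-01 10:05:43 WARN  Retrying connection",
]
print(scan_log_for_alerts(log, r"ERROR"))
```

A collection rule differs only in what it does with the match: instead of raising an alert, it writes the matched entry to the Operations Manager database.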
7. While you are still on the Select a Rule Type page, select the Management Pack to
which you want to add the new rule in the drop-down list at the bottom of the
dialog box. If you have not already created a Management Pack, click the New
button and follow the instructions in the earlier procedure, "To create a new
Management Pack in the Operations Manager 2007 Operations Console."
8. Click Next, and enter a name for the new rule and a description that will help
administrators and operators to identify the rule. Then click the Select button to
open the Select a Target Type dialog box. This dialog box shows a list of all the types
of object to which you can apply the new rule. Type part of the name of the entity
(group, computer, or Management Pack) you want to apply the rule to in the Look
for text box at the top of the dialog box. The list changes to reflect matching items
and the check boxes of these matching items become selected. To see all the
available entities, select the View all targets option button, select the check boxes of
any other targets you want to include, and then click OK.
9. Make sure that the Rule is enabled check box is selected (unless you want to create
the new rule but not enable it yet), and then click Next. The page you see next
depends on the type of rule you are creating:
◦ For a rule that uses a Generic Text (or CSV) Log as its source, you see the
Application Log Data Source page where you specify the source log file path and
name, and the pattern you want to use to match values in the log file. You can
also specify whether the log file is in UTF-8 format instead of the more usual
UTF-16 format. Then click Next to show the Build Event Expression page, where you
specify how the rule will map to values your pattern selects from the log file.
You can use a range of criteria, such as partial and full string matching on the
name, wild-card string matching, and regular expressions. Enter the criteria
value for this row in the third column of the grid. Click the Insert button to add
more conditional expressions to the grid as required.
Clicking the Insert button adds a conditional expression row to the grid.
However, if you click the small "down arrow" next to the Insert button, you
can create a series of AND and OR groups containing conditional expressions.
Select an expression row and click the Formula button to view the conditional
expression for that row, or click the Delete button to remove any row from
the grid.
◦ For an NT Event Log or an NT Event Log (Alert) rule, you see the Event Log Name
page where you specify the source event log (such as Application, System, or
Security). Click the ellipsis button (...) to open a dialog box where you can select
a computer, and then select from the list of all available Windows Event Logs on
that computer. Click OK to return to the Event Log Name page, and then click
Next to show the Build Event Expression page where you specify how the rule
will map to events in the Windows Event Log. You can match on the standard
event properties, or use a numbered parameter, and specify a conditional
expression to match to that property value. You can use a range of criteria, such
as partial and full string matching on the name, wild-card string matching, and
regular expressions (see Figure 3). Enter the criteria value for this row in the
third column of the grid. Click the Insert button to add more conditional
expressions to the grid as required.
Figure 3
Specifying the mapping between a Windows Event and an event rule
◦ For a rule that uses SNMP as its data source, you see an SNMP object identifier
configuration page. Here, you must specify the community string that
identifies the SNMP provider. If you are creating a collection rule, you can
also change the collection frequency using the drop-down list on this page. Then
specify the object identifier properties for each property you want to access.
Alternatively, if you are creating an alert-generating rule, you can select the All
Traps check box.
◦ For a rule that uses a forwarded syslog entry as its data source, you see a Build
Event Expression page similar to that for the NT Event Log rule types. You can
use a range of criteria to match the value your pattern selects from the log
entry, such as partial and full string matching on the name, wild-card string
matching, and regular expressions. Enter the criteria value for this row in the
third column of the grid. Click the Insert button to add more conditional
expressions to the grid as required. Click the small "down arrow" next to the
Insert button to create AND and OR groups containing conditional expressions.
◦ For a rule that uses WMI as its data source, you see the Configure WMI Settings
page. Here, you specify the WMI namespace and the query. You can change the
polling interval using the drop-down list in this page.
◦ For a Windows performance rule, you see the Performance Object, Counter, and
Instance page. Click the Browse button to display the Select Performance
Counter dialog box and select the source computer, the performance counter
object (either a built-in object such as .NET CLR Data or your application
performance counter object), and the actual counters contained in this counter
object. Click the Explain button to see the explanatory text for the selected
counter. Then click OK to automatically populate the Object, Counter, and
Instance text boxes. Alternatively, you can use pattern matching strings for
these values to select multiple counters. You can also select the check box below
the text boxes to specify that the rule should include all instances of the
specified counter. Finally, change the collection Interval settings as required,
and click Next to show the Optimized Performance Collection Settings page.
Here, you must specify a tolerance for changes in the sample values collected
from the data source. Low tolerance (low optimization) means that small
changes in the values will cause Operations Manager to create a database entry,
while high tolerance (high optimization) stores less data but provides less
granular information on changes in performance. You can also specify an
absolute tolerance value or a percentage (see Figure 4).

Figure 4
Specifying the Optimized Performance Collection Settings
◦ For a Script (Event) or a Script (Performance) rule, you see the Schedule page,
where you specify how often the script should execute. The default is every 15
minutes, and you can enter a specific synchronization time from which the
intervals are measured. Then click Next to open the Script page, where you
enter the name of the script to execute, specify the script timeout, and select
the language (VBScript or JScript). Then edit the script in the window or click the
Edit in full screen button, and then type (or copy and paste) the script you
require. If your script requires parameters, click the Parameters button and
enter the parameter names. You can click the Target button next to the
Parameters list to insert property placeholders such as the display name or ID of
the computer or management group. Click OK and, back on the Script page, click
Next. If you are creating a Script (Event) rule, you see the Event Mapper page.
Use the ellipsis buttons (...) next to each text box to specify the Computer, Event
source, Event log, Event ID, Category of the event that will cause the script to
execute, and select the Level (such as Information, Warning, or Error) from the
drop-down list. If you are creating a Script (Performance) rule, you see the
Performance Mapper page. Use the ellipsis buttons (...) next to each text box to
specify the Object, Counter, Instance, and Value of the counter that will cause
the script to execute.
◦ For the Execute a Command rule, you see the Specify your Schedule Settings
page, where you can specify execution of the command or script on a simple
recurring interval basis or create a weekly schedule to execute the command or
script. After creating a suitable schedule, click Next to show the Configure
Command Line Execution Settings page. Here, you specify the full path and
name of the program to execute and any parameters you want to pass to that
program. You can click the arrow button next to the Parameters text box to
insert value placeholders, such as the display name or ID of the computer or
management group. In the Additional settings section of this page, you can
specify the working directory for the program, whether to capture the program
output, and the timeout for program execution.
◦ For the Execute a Script rule, you see the Specify your Schedule Settings page,
where you can execute the command or the script at a simple recurring interval,
or you can create a weekly schedule to execute the command or script. After
creating a suitable schedule, click Next to show the Script page. This is the same
page that appears for the Script (Event) or a Script (Performance) rules discussed
earlier and allows you to create the script to execute for this rule.
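The optimized collection behavior described above for Windows performance rules amounts to filtering samples by a tolerance before they reach the database. The following sketch illustrates the idea; the function and parameter names are assumptions for illustration, not the actual Operations Manager implementation:

```python
def optimize_samples(samples, tolerance, percentage=False):
    """Record a sample only when it differs from the last recorded value
    by more than the tolerance (absolute, or a percent of the last value)."""
    recorded = []
    last = None
    for value in samples:
        if last is None:
            recorded.append(value)  # always record the first sample
            last = value
            continue
        threshold = abs(last) * tolerance / 100.0 if percentage else tolerance
        if abs(value - last) > threshold:
            recorded.append(value)
            last = value
    return recorded

# With an absolute tolerance of 5, small fluctuations are discarded:
print(optimize_samples([100, 102, 104, 112, 113, 90], tolerance=5))
# -> [100, 112, 90]
```

A lower tolerance records more of the intermediate values (more granular data, more storage); a higher tolerance records fewer.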
10. If you are creating an alert-generating rule, you now see the Configure Alerts page.
On this page, you must specify the Name, Description, Priority, and Severity of the
alert that the rule will generate. Select Low, Medium, or High in the Priority drop-
down list, and Warning, Information, or Critical in the Severity drop-down list. If you
want to suppress repeated occurrences of this alert, click the Alert suppression
button to open the Alert Suppression dialog box, and select the check boxes next to
the fields of the source event that must have identical values for the alert to be
considered as a duplicate and suppressed.
You can use custom fields to pass values from event rules to alerts and monitors. Click
the Custom alert fields button and enter the values for any of these fields you want to
use or click the ellipsis button (...) next to a field text box and select a value from the
target entity or the source alert in the lists available in the Alert Description dialog box
that appears.
11. Click Create on the final page of the Create Rule Wizard and the new rule appears in
the list in the main window of the Operations Console.
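The duplicate-suppression behavior configured in step 10 can be pictured as follows: an alert is suppressed when an earlier alert had identical values for every field you selected. The field names and alert structure in this sketch are illustrative assumptions:

```python
def suppress_duplicates(alerts, suppression_fields):
    """Forward an alert only if no earlier alert had identical values
    for every field selected for suppression."""
    seen = set()
    forwarded = []
    for alert in alerts:
        key = tuple(alert.get(field) for field in suppression_fields)
        if key not in seen:
            seen.add(key)
            forwarded.append(alert)
    return forwarded

alerts = [
    {"source": "ShippingSvc", "event_id": 1001, "computer": "WEB01"},
    {"source": "ShippingSvc", "event_id": 1001, "computer": "WEB01"},
    {"source": "ShippingSvc", "event_id": 1001, "computer": "WEB02"},
]
# Suppressing on source and event_id treats all three as duplicates;
# also matching on computer keeps the alert from WEB02.
print(len(suppress_duplicates(alerts, ["source", "event_id"])))             # 1
print(len(suppress_duplicates(alerts, ["source", "event_id", "computer"])))  # 2
```

Choosing more suppression fields therefore makes suppression stricter (fewer alerts are treated as duplicates).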
To create a probe monitor in the Operations Manager 2007 Operations Console
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Management
Pack Templates node if it is not already visible, and then expand this node to show a
list of available templates. Right-click the Management Pack Templates node or one
of the template nodes, and then click Add monitoring wizard.
3. On the Select Monitoring Type page of the Add Monitoring Wizard, select the type of
probe monitor you want to create from the list. The four template types are the
following:
◦ OLE DB Data Source. This probe monitor tests the connectivity to any OLE-DB
compliant database at the specified intervals.
◦ TCP Port. This probe monitor sends a "ping" to the specified port on a specified
computer at the specified intervals.
◦ Web Application. This probe monitor sends one or more HTTP requests to a
specified Web site at the specified intervals.
◦ Windows Service. This probe monitor sends commands to a specified Windows
service at the specified intervals.
4. Click Next to show the General Properties page, and enter a name for the probe
monitor. Enter a description that will help administrators and operators to identify
the monitor. Then select the Management Pack to which you want to add the new
monitor in the drop-down list at the bottom of the dialog box. If you have not
already created a Management Pack, click the New button and follow the
instructions in the earlier procedure, "To create a new Management Pack in the
Operations Manager 2007 Operations Console."
5. The page you see next depends on the type of probe monitor you are creating:
◦ OLE DB Data Source. For this type of probe monitor, you see a page where you
specify the connection details for the database. You can specify a Simple
Configuration using the Provider name, the IP address or device name, and the
name of the Database. Alternatively, you can select Advanced Configuration
and provide the full connection string. Click the Test button to check the
connection.
◦ TCP Port. For this type of probe monitor, you see a page where you specify the
IP address or device name and the Port number of the target computer you
want to probe. Click the Test button to check the availability of the specified
port.
◦ Web Application. For this type of probe monitor, you see a page where you
specify the URL of the Web application or Web page you want to probe. Click
the Test button to check the availability of the specified URL.
◦ Windows Service. For this type of probe monitor, you see a page where you
specify the service name. Click the ellipsis button (...) to open the Select
Windows Service dialog box, where you can select a computer and see a list of
the available services on that computer. Then go directly to step 9.
6. Click Next and, for all types except the Windows Service monitor, you see the
Choose Watcher Nodes page. This displays a list of all computers running the
Operations Manager remote agent. Select the check box next to the
computer(s) that you want to execute this probe monitor. You can execute it from
the Operations Manager management server or any of the remote agent-
managed computers in the management group.
7. Use the controls at the bottom of the Choose Watcher Nodes page to change the
frequency at which the probe monitor executes to the required value. The default is
every two minutes.
8. Click Next to see a summary of your settings. If you are creating a Web Application
probe monitor, you can select the check box at the bottom of this page to start the
Web Application Editor, where you can specify exact details of the request, create
group requests, and even record navigation using your Web browser.
9. Click Create to create the new monitor and close the wizard. Then expand the tree
view in the left pane of the main window, and then select the Monitors node (which
is under the Management Pack Objects node) if it is not already visible. The main
section of the window shows a list of all the monitors. You can use the Change Scope
link in the toolbar to limit the list of items to those within a particular scope (such as
a Management Pack, Group, or Distributed Application), which makes it easier to
find and work with the monitors you create.
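Conceptually, the TCP Port probe monitor created above just attempts a connection on a schedule and reports success or failure. The following is a minimal sketch of that idea, and makes no assumptions about how the agent actually implements the probe:

```python
import socket

def check_tcp_port(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the
    timeout, False otherwise (the essence of a TCP Port probe)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demonstrate against a listener we control; a watcher node would run
# this check on a schedule (every two minutes by default) and roll the
# result into the monitor's health state.
server = socket.socket()
server.bind(("127.0.0.1", 0))   # OS-assigned free port
server.listen(1)
port = server.getsockname()[1]
print(check_tcp_port("127.0.0.1", port))  # True while the listener is up
server.close()
print(check_tcp_port("127.0.0.1", port))  # False once it is closed
```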

To create a unit monitor in the Operations Manager 2007 Operations Console


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Monitors
node (which is under the Management Pack Objects node) if it is not already visible,
and select it. Use the Change Scope link in the toolbar to limit the list of items to
those within the required scope, and expand the nodes below the Entity Health
node in the main Operations Console window (see Figure 5).
Figure 5
The list of monitors for a distributed application

3. For each entity (such as a distributed application or group), you can create monitors
for the four categories: Availability, Configuration, Performance, and Security.
Right-click the category node to which you want to add a new monitor, click Create a
Monitor, and then click Unit Monitor to start the Create Monitor Wizard.
4. On the Select a Monitor Type page of the Create Monitor Wizard, select the type of
monitor you want to create. There are many different types available, organized in
folders denoting the type (see Figure 6). These monitor types equate to the rule
types described in more detail in the earlier procedure, "To create a new rule for a
group in the Operations Manager 2007 Operations Console."
Figure 6
Some of the different types of unit monitor you can create

There are five basic types of unit monitor. You can create a monitor that reacts to an event, to
changes in a performance counter value, to the result of executing a custom script, or to an
SNMP event or trap, or one that monitors a Windows service.
For an event monitor, you can detect one or more events that match specified criteria (a
correlated event), a combination of different events occurring over a specified period, a
missing event that you expect to occur, or a series of repeated events. You can also specify
whether the operator must reset the state manually, or whether another event or a timer can
reset the state.
For a performance monitor, you can detect specific values or specify threshold value ranges,
and expose a two-state (RED and GREEN) or a three-state (RED, YELLOW, and GREEN) health
status. You can also create a baseline performance monitor that measures average
performance over time. This is useful for measuring adherence to service level agreements
(SLAs) and estimating performance capabilities of the application and its individual
components.
For a script monitor, you can expose a two-state or a three-state health status.
For an SNMP monitor, you can detect a combination of different events occurring over a
specified period.
For a Windows service monitor, you can detect changes to the state and operation of the
service.
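The two-state and three-state health status described above for performance monitors amounts to mapping each counter sample against one or two thresholds. A minimal sketch, with illustrative threshold names and an assumed "higher is worse" counter:

```python
def health_state(value, warning_threshold, critical_threshold):
    """Map a performance counter sample to a three-state health value,
    as a three-state performance monitor does (thresholds illustrative)."""
    if value >= critical_threshold:
        return "RED"      # Critical
    if value >= warning_threshold:
        return "YELLOW"   # Warning
    return "GREEN"        # Healthy

# Example: a queue-length counter with a warning threshold of 50 and a
# critical threshold of 100.
print([health_state(v, 50, 100) for v in (10, 60, 150)])
# -> ['GREEN', 'YELLOW', 'RED']
```

A two-state monitor is the same idea with a single threshold, and a baseline monitor derives the thresholds from observed average behavior rather than fixed values.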

5. The following pages of the Create Monitor Wizard collect all the information
required for the specific monitor type you select. Each asks first for the name and
description of the monitor, and the Management Pack to add it to. Then there is a
different series of pages, but all follow the same basic pattern. The first steps help
you to set up the correlation (mapping) between one or more source events,
counters, or script executions and the new monitor:
◦ Event Monitor. For this type of monitor, you specify the name of the correlated
event logs, and expressions that match the events you want to monitor. This is a
similar process to that described in the earlier procedure for creating an event
rule.
◦ Performance Monitor. For this type of monitor, you specify the counter name
and location, and the threshold values. You can also use this type of monitor to
create baseline information (including varying the "learning rate") that indicates
the average performance of the monitored application or its individual
components over long or short business cycles (see Figure 7).
Figure 7
Specifying the threshold and learning cycle values for a baseline performance monitor
◦ Script Monitor. For this type of monitor, you specify the script to execute, and
any parameters it requires.
◦ SNMP Monitor. For this type of monitor, you specify one or more expressions
that match the SNMP traps or probes.
◦ Windows Service Monitor. For this type of monitor, you specify the location and
name of the service you want to monitor.
6. Complete the remaining pages of the Create Monitor Wizard. These pages include
the following:
◦ The Configure Health page that allows you to specify the health states that
Operations Manager will display when the correlated event, counter threshold,
or script execution occurs. You assign a Critical (RED), Warning (YELLOW), or
Healthy (GREEN) health state to each occurrence or value of the correlated
event, counter, or script execution.
◦ The Configure Alerts page that allows you to specify if changes to the state
detected by this monitor will raise an alert to display in the console (and,
optionally, send it to operators as an e-mail or pager message). You also specify
the severity of the alert here.
7. Click Create to create the new monitor and you will see it appear in the list in the
main window of the Operations Console.
8. To add product or company knowledge to a monitor, select it in the list in the main
window, right-click, and then click Properties. Open the Product Knowledge page,
click the Edit button, and enter the required information that helps operators and
administrators to diagnose, resolve, and verify resolution of the problem.

To create a health rollup monitor in the Operations Manager 2007 Operations Console
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Monitors
node (which is under the Management Pack Objects node) if it is not already visible,
and select it. Use the Change Scope link in the toolbar to limit the list of items to
those within the required scope, and expand the nodes below the Entity Health
node in the main Operations Console window to see the four categories:
Availability, Configuration, Performance, and Security.
3. If you want to create a rollup monitor that reflects the health state of a complete
distributed application or a top-level group, select the Entity Health node for that
distributed application or group. If you want to create a rollup monitor that reflects
the health state of one of the four categories below the Entity Health node
(Availability, Configuration, Performance, and Security), select that node instead.
4. Right-click the selected node, click Create a monitor, and then click either
Dependency Rollup Monitor or Aggregate Rollup Monitor to start the wizard.
A dependency rollup monitor allows you to specify the rollup policy based on subsets
of computers or components within the same group and specify what state to expose
when monitoring is unavailable or the computers are in maintenance mode
(temporarily disconnected from the monitoring system). An aggregate rollup monitor
simply exposes the best or worst state of all the computers or components within the
group.
5. In the General Properties page of the wizard, enter a name for the monitor and a
description that will help administrators and operators to identify the monitor. Then
click the Select button to open the Select a Target Type dialog box. This dialog box
shows a list of all the types of object to which you can apply the new monitor. Type
part of the name of the entity (group, computer, or Management Pack) you want to
apply the monitor to in the Look for text box at the top of the dialog box. The list
changes to reflect matching items and the check boxes of these matching items
become selected. To see all of the available entities, click the View all targets option
button and select the check boxes of any other targets you want to include.
6. Click OK to close the Select a Target Type dialog box, and select the appropriate
parent monitor that will act as a rollup for this monitor from the list on the main
wizard page.
7. Select the Management Pack to which you want to add the new monitor in the drop-
down list at the bottom of the dialog box. If you have not already created a
Management Pack, click the New button and follow the instructions in the earlier
procedure, "To create a new Management Pack in the Operations Manager 2007
Operations Console." Also make sure that the Monitor is enabled check box is
selected unless you do not want to enable the monitor immediately. Then click Next.
8. If you are creating an aggregate rollup monitor, the Health Rollup Policy page you
see next allows you to specify if the health state exposed by the monitor is that of
the worst state of any member of the group or the best state of any member of the
group. Select the required option, and then go to step 12 of this procedure.
As an example, if you select Worst state of any member while one computer has a
Warning state, one has a Critical state, and the rest have a Healthy state, the
monitor will show Critical. If you select Best state of any member while one computer
has a Warning state, one has a Critical state, and the rest have a Healthy state, the
monitor will show Healthy.
9. If you are creating a dependency rollup monitor, the next wizard page contains a
tree-view list of the entities related to the current entity for which you are creating
the monitor. These relationships match those that you (or the template you used in
the Distributed Application Designer) created. You also see all the subgroups within
the current group. Expand the target entity or group for which you are creating a
Monitor, and you see the Entity Health node and the four category nodes,
Availability, Configuration, Performance, and Security. Within each of these nodes
are any Monitors you have already created, and any default monitors created by the
Distributed Application Designer (see Figure 8).
Figure 8
Selecting the target entity for a Dependency Rollup Monitor
10. Select the node for which you want to roll up the state of the members, and click
Next to show the Configure Health Rollup Policy page. This page allows you to
specify how the overall state for the group will reflect the states of individual
members of the group. The three options in this page (see Figure 9) are:
◦ Worst state of any member. If you select this option, Operations Manager will
set the State value displayed in the Operations Console to the Severity
specified for the worst of the current unresolved alerts for the members of this
group.
◦ Worst state of the specified percentage of members in good health state. If
you select this option, you must specify a percentage that defines the
proportion of the group that will act as the state indicator for the group. Operations
Manager will select a set of members from the group that consists of the
computers with the best health state up to the percentage you specified of the
total group membership. In other words, if there are 10 computers and you
specify 60%, Operations Manager will select the six members of the group that
currently have the least severe state. It then uses the worst (the most severe)
state of the subset it selects as the overall (rolled-up) state for the group, and
displays this in the Operations Console as the State value for this group.
◦ Best state of any member. If you select this option, Operations Manager will set
the State value displayed in the Operations Console to the Severity specified
for the best of the current unresolved alerts for the members of this
group. It is unlikely that you will use this option very often, as it effectively hides
the state of most of the members of the group as long as one member is
performing correctly.

Figure 9
Configuring the Health Rollup Policy for a Dependency Rollup Monitor
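The three rollup policy options can be sketched as follows; the numeric severity ranking and the function name are assumptions made for illustration only:

```python
# Severity ranking used for comparisons (an assumption for this sketch).
SEVERITY = {"Healthy": 0, "Warning": 1, "Critical": 2}

def rollup_state(states, policy, percentage=None):
    """Compute a rolled-up health state for a group, mirroring the three
    Health Rollup Policy options described above (illustrative only)."""
    ranked = sorted(states, key=SEVERITY.get)       # best state first
    if policy == "worst":
        return ranked[-1]
    if policy == "best":
        return ranked[0]
    if policy == "percentage":
        count = max(1, int(len(ranked) * percentage / 100))
        healthiest = ranked[:count]                 # the best N% of members
        return healthiest[-1]                       # worst state within that subset
    raise ValueError(policy)

# Ten members, 60%: the six healthiest members are all Healthy, so the
# rolled-up state is Healthy despite the Critical member.
states = ["Healthy"] * 8 + ["Warning", "Critical"]
print(rollup_state(states, "worst"))                      # Critical
print(rollup_state(states, "percentage", percentage=60))  # Healthy
```

This makes the trade-off concrete: the percentage policy ignores a small number of unhealthy members, while the worst-state policy surfaces any single failure.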
11. In the lower section of the Configure Health Rollup Policy page, use the two drop-
down lists to specify what state you want to assume for unavailable members of the
group (members where monitoring has failed, or members in maintenance mode). In
the first drop-down list, specify if the Rollup Monitor should treat a failed member's
state as either a Warning or an Error, or just ignore the failed member. In the second
drop-down list, specify if the Rollup Monitor should treat a member in maintenance
mode as either a Warning or an Error, or just ignore this member.
12. Click Next to show the Configure Alerts page (for both a Dependency Rollup
Monitor and an Aggregate Rollup Monitor). Set or clear the check box at the top of
the Alert Settings section of the page to specify whether this Monitor will create an
alert to display in the console and send to operators when the health state changes.
If you turn on alerts, use the drop-down list below this check box to specify
generation of an alert for both a Critical state and a Warning state, or just for a
Critical state. If you require the alert to be automatically resolved when the monitor
returns to a Healthy state, select the check box below the drop-down list.
13. In the Alert Properties section of the page, enter a name for the Alert, a description,
and select the Priority and Severity you want to assign to the alert. The available
values for Priority are Low, Medium, and High. The available values for Severity are
Critical, Warning, and Information.
14. Click Create and you will see the new Monitor appear in the list in the main window
of the Operations Console.
15. To add product or company knowledge to a monitor, select it in the list in the main
window, right-click, and select Properties. Open the Product Knowledge page, click
the Edit button, and enter the required information that helps operators and
administrators to diagnose, resolve, and verify resolution of the problem.

Guidelines for Creating a Management Pack in the Operations Manager 2007 Operations Console
When creating a management pack in the Operations Manager 2007 Operations Console, you
should consider the following proven practices:
• Use the management model you developed for your application to help you decide
what rules and performance counters you need to create.
• Either use the Operations Manager 2007 Distributed Application Designer to create a
multi-level hierarchy of related groups as a distributed application that mirrors that of
the management model or create a multi-level hierarchy of instance monitoring groups
and subgroups that mirrors that of the management model.
• Use a name for the distributed application or the top-level monitoring group that makes
it easy to identify. You will later be able to use the top-level group to expose the overall
rolled-up state of the entire application. Each subgroup will expose the rolled-up state
of the members of that group.
• Create only rules and monitors directly relevant to your application. Avoid duplicating
rules and monitors that are available in built-in Management Packs, such as measuring
processor usage or free memory.
• Use the available health configuration settings for monitors to display the health state
of individual components of the application, and use the rolled-up health state
indicators for sections of the application and the application as a whole.
• Use alerts to immediately raise urgent issues to operations staff, perhaps through e-
mail or pager.
• Take advantage of specific features of the monitoring application, such as probes that
can provide heartbeat monitoring of remote services, or the ability to run scripts or
commands in response to alerts (for example, to query values or call a method that
provides diagnostic information then generates a suitable alert).
• Provide as much useful company-specific and application-specific knowledge as possible
for each group and rule to make problem diagnosis, resolution, and verification easier
for operators and administrators.
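Although you work in the Operations Console, everything you create is ultimately stored as management pack XML, which you can export and inspect. The following fragment sketches what the manifest of a pack such as the Northern Electronics example might look like; the ID, name, version, and reference values shown here are placeholders, so treat an exported pack from your own management group as the authoritative reference for the exact schema:

```xml
<!-- Illustrative manifest fragment; all identifiers are placeholders -->
<ManagementPack ContentReadable="true"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Manifest>
    <Identity>
      <ID>NorthernElectronics.Shipping</ID>
      <Version>1.0.0.0</Version>
    </Identity>
    <Name>Northern Electronics Shipping Application</Name>
    <References>
      <!-- Version numbers and tokens vary with your installation -->
      <Reference Alias="Health">
        <ID>System.Health.Library</ID>
        <Version>6.0.5000.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
</ManagementPack>
```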
Editing an Operations Manager 2007 Management Pack
After creating or importing a Management Pack in Operations Manager 2007, you will typically
need to perform additional actions to fine-tune the management pack or to respond to changes
in the operations environment and your management model.
In the case where a Management Pack originated in the Management Model Designer (MMD),
changes are quite commonly required, because the import process does not always generate
the ideal combination of rules and rule groups. For example, the MMD generates an alert that
creates a notification to members of the administration group. However, this group has no
members by default, so you may want to edit this notification, add members to the various
notification groups, or create new notification groups.

This section contains the following procedures:


• To edit the properties of a Management Pack
• To edit a distributed application
• To edit a rule
• To edit a monitor
• To create and edit notification channels and recipients
• To create and edit notification subscriptions
• To view and edit the global settings

To edit the properties of a Management Pack


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Administration
button. If the navigation pane is not visible, click Navigation Pane on the View
menu.
2. Expand the tree view in the left pane of the main window and select the
Management Packs node. In the main window, right-click the Management Pack you
want to edit, and then click Properties.
3. The Properties dialog box for a Management Pack contains three tabbed pages:
◦ Properties. On this page, you can edit the Name, Version, and Description
(except for a built-in or sealed Management Pack).
◦ Knowledge. On this page, you can click the Edit button to edit the product and
company-specific knowledge for the Management Pack.
◦ Dependencies. On this page, you can see which other Management Packs
depend upon this one, and which other Management Packs this one depends
upon. You cannot edit these lists.
4. Click OK or Apply to save your changes to the Management Pack properties.

To edit a distributed application


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Distributed
Applications node (which is under the Management Pack Objects node) if it is not
already visible, and select it. In the main window, right-click the distributed
application you want to edit, and then click Edit to open the Distributed Application
Designer.
3. Use the Distributed Application Designer to modify your distributed application as
required. You can add and remove components, change their properties, and add
and remove relationships between the components. For more details about working
with the Distributed Application Designer, see the earlier procedure, "To create a
new distributed application in the Operations Manager 2007 Operations Console."
4. Click the Save button on the toolbar of the Distributed Application Designer when
you finish editing your distributed application.

To edit a rule
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left-hand pane of the main window to show the Rules
node (which is under the Management Pack Objects node) if it is not already visible,
and select it. Use the Change Scope link in the toolbar to limit the list of items to
those within the required scope.
3. Select the rule you want to edit, right-click it, and then click Properties (or double-
click the rule). The Properties dialog box contains the following tabbed pages:
◦ General. On this page, you can edit the Rule name and the Description.
However, you cannot change the rule target in this dialog box. To enable or
disable this rule, select or clear the Rule is enabled check box.
◦ Configuration. On this page, you can see the details of the source for the rule,
such as an event log, WMI query, or a performance counter. If the details are
available for editing, you will see an Edit button that opens a source type-
specific dialog box that allows you to change the settings for the source of this
rule. If the details are not editable, you will see a View button that opens a
source type-specific dialog box that allows you to view the settings for the
source of this rule.
◦ Configuration (Responses). In the Responses section of the Configuration page,
you can see a list of any Responses defined for this rule, such as creating an
alert or running a script. If the details are available for editing, you can
select a response in the list and click the Edit button to view and edit the
properties of the selected response. You can also add new responses or remove
existing responses. If the details are not editable, you will see just a View
button that opens a dialog box that allows you to view the properties of the
selected response for this rule.
◦ Product Knowledge and Company Knowledge (for built-in rules). On this page,
you can see the knowledge associated with this rule. Click the Edit button to edit
the company knowledge for a built-in rule.
4. Click OK or Apply to save your changes to the rule properties.
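The source and responses you see on the Configuration page map to DataSources and WriteActions elements in the underlying management pack XML. The fragment below is only a sketch of that shape; the identifiers are placeholders, and required details (such as the event filter expression) are omitted for brevity, so export your own pack to see a complete definition:

```xml
<!-- Illustrative sketch; IDs are placeholders and required child
     elements (for example, the event filter expression) are omitted -->
<Rule ID="NorthernElectronics.OrderFailed.AlertRule" Enabled="true"
      Target="NorthernElectronics.ShippingServer">
  <Category>Alert</Category>
  <DataSources>
    <!-- The rule source: entries from the Application event log -->
    <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.EventProvider">
      <LogName>Application</LogName>
    </DataSource>
  </DataSources>
  <WriteActions>
    <!-- The rule response: generate an alert -->
    <WriteAction ID="Alert" TypeID="Health!System.Health.GenerateAlert" />
  </WriteActions>
</Rule>
```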

To edit a monitor
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Authoring button. If
the navigation pane is not visible, click Navigation Pane on the View menu.
2. Expand the tree view in the left pane of the main window to show the Monitors
node (which is under the Management Pack Objects node) if it is not already visible,
and select it. Use the Change Scope link in the toolbar to limit the list of items to
those within the required scope.
3. Select the monitor you want to edit, right-click it, and then click Properties (or
double-click on the monitor). The Properties dialog box contains the following
tabbed pages:
◦ General. On this page, you can edit the Name and the Description. Although you
cannot change the monitor target in this dialog box, you can select a different
parent monitor if you want a different roll-up monitor to handle state changes
for this monitor. To enable or disable this monitor, select or clear the Monitor is
enabled check box.
◦ Product Knowledge and Company Knowledge (for roll-up monitors). On this
page, you can click the Edit button and edit the product and company-specific
knowledge for this monitor.
◦ Health. On this page, you can specify the health state (Critical, Warning, or
Healthy) for each monitor condition. For example, you can map the Degraded
monitor state to a Warning health state.
◦ Alerting. On this page, you can edit the settings for an alert that this monitor will
generate.
◦ Diagnostic and Recovery. On this page, you can add, modify, and remove
diagnostic and recovery tasks that will execute when the state of the monitor
changes to Critical or Warning. For example, you can configure a script or a
command to execute.
4. Depending on the type of monitor you are editing, you may see other pages in the
Properties dialog box. These include details of the event, script, counter, WMI query,
log file, Windows service, or other source for the monitor. Each allows you to modify
the settings for this monitor source. There are also pages, depending on the monitor
type, for the schedule to execute a script or command, and the actions that reset the
monitor when the state changes.
5. Click OK or Apply to save your changes to the monitor properties.
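The mappings you set on the Health page correspond to OperationalState entries on the monitor in the management pack XML, where the console's Healthy, Warning, and Critical states appear as Success, Warning, and Error. The fragment below is an illustrative sketch for a Windows service monitor; the IDs are placeholders and the monitor's type-specific configuration is omitted, so verify the details against an exported pack:

```xml
<!-- Illustrative sketch; IDs are placeholders and the monitor's
     type-specific configuration is omitted -->
<UnitMonitor ID="NorthernElectronics.ShippingService.Monitor"
             Accessibility="Internal" Enabled="true"
             Target="NorthernElectronics.ShippingServer"
             ParentMonitorID="Health!System.Health.AvailabilityState"
             TypeID="Windows!Microsoft.Windows.CheckNTServiceStateMonitorType">
  <OperationalStates>
    <!-- Map each monitor condition to a health state -->
    <OperationalState ID="ServiceRunning" MonitorTypeStateID="Running"
                      HealthState="Success" />
    <OperationalState ID="ServiceStopped" MonitorTypeStateID="NotRunning"
                      HealthState="Error" />
  </OperationalStates>
</UnitMonitor>
```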

To create and edit notification channels and recipients


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Administration
button. If the navigation pane is not visible, click Navigation Pane on the View
menu.
2. If you previously specified and configured the notification channels that you want to
be available for sending alerts, go directly to step 10 of this procedure. However, if
you have not already specified and configured the notification channels, you must
do so before you can create recipients and subscriptions.
3. Expand the tree view in the left pane of the main window and select the Settings
node. Right-click the Notification setting in the main window, and then click
Properties. The Notification settings dialog box contains four tabbed pages where
you configure the parameters for Email, Instant Messaging, Short Message Service,
and Command notification channels.
4. On the Email page, select the Enable e-mail notifications check box if you want to
enable e-mail notifications, and then click the Add button to open the Add SMTP
Server dialog box. Specify the fully qualified domain name of the mail server to use,
the Port number (the default is 25), and the Authentication method (Anonymous or
Windows Integrated).
You must configure your mail server to allow the Operations Manager management
server to relay through it if you are not using the local SMTP service.
5. Back in the main setting dialog box, specify a valid Return address for messages, and
a retry interval for failed messages. Then use the two text boxes at the bottom of
the page to specify the Email subject and the Email message to send. Click the
"arrow" button next to these text boxes to insert placeholder strings (replaced by
the actual value when Operations Manager generates the e-mail alert) into the
subject or message body. Finally, specify the required encoding for the message
from the Encoding drop-down list.
6. On the Instant Messaging page, select the Enable instant messaging notifications
check box if you want to enable instant messaging notifications, and then enter the
name of the IM server and the Return address in the two text boxes below this.
Next, specify the IM port number, the Authentication method to use, and the
Protocol option (TCP or TLS).
7. Now use the text box at the bottom of the page to specify the IM message to send.
Click the "arrow" button next to the text box to insert placeholder strings (replaced
by the actual value when Operations Manager generates the IM alert) into the
message body. Finally, specify the required encoding for the message from the
Encoding drop-down list.
8. On the Short Message Service page, select the Enable short message service
notifications check box if you want to enable SMS notifications. Now use the text
box to specify the SMS message to send. Click the "arrow" button next to the text
box to insert placeholder strings (replaced by the actual value when Operations
Manager generates the SMS alert) into the message body. Finally, specify the
required encoding for the message from the Encoding drop-down list.
9. On the Command page, if you want to enable command notifications, click the Add
button to open the Notification Command Channel dialog box. Enter a name and
description for the channel, and enter the full path to the executable file that
implements the channel or command you want to execute for this notification. If
you need to pass parameters to the command, add these in the next text box. Click
the "arrow" button next to this text box to insert placeholder strings (replaced by
the actual value when Operations Manager executes the command) into the
command parameters. Finally, specify the initial directory and click OK to return to
the Command page of the Properties dialog box. You can also edit and remove
existing commands in the list in this page of the Properties dialog box.
10. After you configure the available notification channels, you can create and configure
recipients and subscriptions. Expand the tree view in the left pane of the main
window and select the Notifications node. Expand this node to see the
Subscriptions and Recipients nodes.
11. Right-click the Recipients node, and then click New Notification Recipient. On the
General page of the Notification Recipient Properties dialog box, specify the display
name for the recipient. You can click the ellipsis button (...) to open the Select User
or Group dialog box that allows you to select from all the users and groups
configured on the machine or the domain.
12. Back in the Notification Recipient Properties dialog box, specify if Operations
Manager should send notifications at any time (when they occur) or only during
specified times. If you select the "specified times" option, you can specify periods
when Operations Manager can send messages, and when it cannot. Click the Add
button above the Schedules to send or Exclude schedules list and use the Specify
Schedule Period dialog box that opens to specify the period in terms of the days,
and the start and end times. You can add multiple schedules to both lists.
13. Open the Notification Devices page of the Notification Recipient Properties dialog
box to see a list of channels (if any) already configured for sending notifications to
this user. Click the Add button above the list to open the Create Notification Device
Wizard.
14. On the first page of the wizard, specify the channel (E-mail, IM, SMS, or Custom
Command) and edit the delivery address for this recipient if required.
15. Click Next and enter a name that identifies this channel for this recipient, and then
click Finish to create the new device. You will see it appear in the Notification
Devices list in the Notification Recipient Properties dialog box. Repeat this process
to add a notification device that maps this user to every notification channel you
want them to be able to use. You can add more than one device for each channel
(such as multiple email accounts) if required.
16. Click OK or Apply to save the new recipient, and you will see it appear in the main
window when you select the Recipients node in the right-side tree view.
17. To edit the properties for a recipient, double-click the recipient entry in the main
window to open the Notification Recipient Properties dialog box. To delete a
recipient, right-click the recipient entry in the main window, and then click Delete.

To create and edit notification subscriptions


Subscriptions link an alert generated by Operations Manager with one or more
recipients that will receive the alert. If you have not previously specified and
configured the notification channels that you want to be available for sending alerts,
you must do so before you can perform this procedure to create subscriptions. For
more details, see the earlier procedure, "To create and edit notification channels and
recipients."
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Administration
button. If the navigation pane is not visible, click Navigation Pane on the View
menu.
2. Right-click the Subscriptions node (which is under the Notifications node) and click
New Notification Subscription to start the Create Notification Subscription Wizard.
3. Click Next to show the General page, and provide a name and description for this
subscription. Then click the Add button to open the Add Notification Recipient
dialog box that shows a list of all configured recipients. Select the check box of all
those you want Operations Manager to notify as part of this subscription, and click
OK to return to the wizard.
4. Click Next in the wizard to show the User Role Filter page. If you have configured
filtering based on user roles (in the User Roles section of the Security configuration),
you can specify the group or role using the drop-down list in this page. Leave it
empty if you do not want to use role filtering.
5. Click Next to show the Groups page, which contains a list of all the groups you have
configured. Select the check boxes in the list for the groups whose alerts will trigger
notifications through this subscription.
6. Click Next to show the Classes page, which allows you to specify which classes are
"approved" for activating a subscription based on an alert. The default is all classes.
However, you can select the Only classes specifically added option and build a list of
approved classes if you want. Click the Add button to open the Select Management
Pack Objects dialog box and select the Management Packs that you want to include,
and then click OK. Back in the wizard, you can continue to use the Add and Remove
buttons to create the list of approved classes you need.
7. Click Next to show the Alert Criteria page, where you specify which alerts will
activate this subscription. You can select properties from four lists: Severity, Priority,
Resolution State, and Category. Select the check box next to all the criteria you
require. For example (see Figure 10), you can specify that the alert must be an error
or a warning and it must meet the following criteria:
◦ It must have a priority of Medium or High (using the second list).
◦ It must be a New (not a Closed) alert (using the third list).
◦ It must be an alert caused by a Performance Collection or an Event Collection
monitor rule.
Figure 10
Specifying the criteria for activation of a subscription
8. Click Next to show the Alert Aging page, where you specify if the subscription will
respond to the aging of alerts, sending repeated notifications at each stage of the
alert aging process. You can change the period between alerts in this page.
9. Click Next to show the Format page. By default, the subscription will use the settings
specified in the notification channel you created earlier, and which you specified for
this subscription. However, you can change the options from Use the default and
specify a custom subject and message for e-mail, IM, and SMS notifications if
required. Click the "arrow" button next to each text box to insert placeholder strings
(replaced by the actual value when Operations Manager generates the notification)
into the subject or message body.
10. Click Finish to create the new subscription, which appears in the list in the main
window.
11. To edit the properties for a subscription, double-click the subscription entry in the
main window to restart the Notification Subscription Properties Wizard, where you
can modify the settings. To delete a subscription, right-click the subscription entry
in the main window, and then click Delete.
To view and edit the global settings
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Administration
button. If the navigation pane is not visible, click Navigation Pane on the View
menu.
2. Expand the tree view in the left pane and select the Settings node. A list of
configuration settings appears in the main window under the three categories
Agent, General, and Server.
3. To edit the settings for the Operations Manager remote agents installed on
monitored computers, double-click the Heartbeat entry in the Agent section.
Heartbeat checking ensures that a remote agent is available. Change the value for
the Heartbeat interval from its default value of 60 seconds as required.
4. To specify how Operations Manager will minimize the size of the database that
stores operational data, double-click the Database Grooming entry in the General
section. Select the type of data you want to change the setting for, such as Resolved
Alerts, and click the Edit button to specify the number of days before Operations
Manager will remove this information from the database.
5. To edit the settings for notifications sent to recipients via subscriptions, double-click
on the Notification entry in the General section. For details about the settings
available in the Global Management Group Settings – Notification dialog box, see
the previous procedure, "To create and edit notification channels and recipients."
6. To edit the settings for participation in feedback programs and error reporting,
double-click the Privacy entry in the General section. The Global Management
Group Settings – Privacy dialog box has four tabbed pages that allow you to do the
following:
◦ CEIP. On this page, you can specify if you want to participate in the Customer
Experience Improvement Program by providing feedback to Microsoft about
how you use Operations Manager 2007.
◦ Error Reporting. On this page, you can specify if Operations Manager will
send error reports to Microsoft when an error occurs within the software, and
whether it will prompt before sending them.
◦ Error Transmission. On this page, you can specify a filter for the types and
sources of errors that Operations Manager will send to Microsoft, what
information to include in the error reports, and whether to display links to
possible solutions.
◦ Operational Data Reports. On this page, you can specify if you want to send
Operational Data Reports to Microsoft about how you use the product so it can
be improved in accordance with customer requirements.
7. To edit the settings for viewing and generating reports, double-click the Reporting
entry in the General section and specify the URL of the reporting server. The
Operations Console uses this information to launch reports.
8. To edit the settings for accessing the Web Console and your own custom online
product knowledge, double-click the Web Addresses entry in the General section.
Specify the URL of the Operations Manager 2007 Web Console, and the URL of the
start page for your product knowledge Help pages or Web site. The Operations
Console uses these values to launch the Web Console and access online knowledge.
9. To edit the way that Operations Manager reacts to failed remote agents, double-
click the Heartbeat entry in the Server section and change the value for the number
of missing heartbeats allowed before Operations Manager will create a failure alert.
The default is three.
10. To edit the security setting for installation of remote agents, double-click the
Security entry in the Server section. For maximum security, select the option to
reject manually installed remote agents. If you want to allow manual installation of
the Operations Manager agent on remote computers, select the second option. You
can then manually approve newly installed agents. If you want Operations Manager
to approve new remote agent installations automatically, select the check box below
this option.
11. Click OK or Apply in each of the settings dialog boxes to save the changes you made.

Guidelines for Editing an Operations Manager 2007 Management Pack
When editing an Operations Manager 2007 Management Pack, you should consider the
following proven practices:
• Ensure that your Management Pack contains the appropriate distributed applications,
groups, rules, and monitors to match the management model and the instrumentation
exposed by the application. Keep the hierarchy of the groups and the properties of the
rules and monitors as simple as possible, while following the structure and the
requirements of the management model.
• Ensure that you assign the elements of your Management Packs to the appropriate
groups and subgroups that match the physical layout of the computers that will run the
application. Use rollup monitors to expose the rolled up overall health state for a
subgroup, or a specific category (such as Availability or Performance) within that
subgroup.
• Create the appropriate recipients and subscriptions for operators that manage the
application, and other people that have an interest in its operation (such as business
owners and application developers), so that you can automatically send alerts to them.
Configure responses for the rules and alerts that send operational alerts to the
appropriate groups.
• Modify any of the global settings that affect your application. For example, you may
want to use the custom alerts fields for company-specific information, or modify the
alert resolution states and service level periods to suit your requirements.
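The roll-up behavior described in these guidelines can be pictured as a simple "worst state wins" calculation: a subgroup reports the worst health state of its members, and the top-level group repeats the calculation over its subgroups. The following Python sketch illustrates the concept only; it is not Operations Manager code, and roll-up monitors can also be configured with other policies (such as best state):

```python
# Conceptual sketch of a "worst state wins" roll-up policy.
# This is an illustration of the idea, not Operations Manager code.

SEVERITY = {"Healthy": 0, "Warning": 1, "Critical": 2}

def rolled_up_state(member_states):
    """Return the worst health state reported by a group's members."""
    if not member_states:
        return "Healthy"  # nothing unhealthy to report
    return max(member_states, key=lambda state: SEVERITY[state])

# One Critical component makes the whole subgroup Critical...
subgroup = rolled_up_state(["Healthy", "Warning", "Critical"])

# ...and the top-level group rolls up the states of its subgroups.
application = rolled_up_state([subgroup, "Healthy", "Warning"])

print(subgroup, application)  # Critical Critical
```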

Deploying the Operations Manager 2007 Agent


After you import a management model to generate a Management Pack, or after you create a
new Management Pack, and designate the servers that will run the application and its
components using one or more instance groups or a distributed application, you must install the
Operations Manager agent on the target computers if you have not already done so.
To deploy the Operations Manager agent to remote computers
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Administration
button. If the navigation pane is not visible, click Navigation Pane on the View
menu.
2. Expand the tree view in the left pane and select the Device Management node.
Expand the Device Management node to see the nodes that contain the different
types of computer and device discovered on your network (such as management
servers, agent managed, and agent-less managed computers).
3. Right-click the Device Management node, and then click Discovery Wizard to open
the Computer and Device Management Wizard. This wizard helps you to install the
Operations Manager agent on remote computers.
4. Click Next on the Introduction page to show the Auto or Advanced page. You can
select the first option and allow the wizard to scan your entire domain, but this can
be a long process. Instead, select the second option, Advanced discovery, and select
a value in the drop-down list that corresponds to the types of computers you want
to discover. You can select Servers & Clients (the default), Servers Only, Clients
Only, or Network Devices. Ensure that the correct management server is selected in
the Management Server drop-down list. If you want to ensure that connections can
be made to remote computers, select the check box below this list.
5. Click Next to show the Discovery Method page. Select the first option if you want to
allow the wizard to scan Active Directory to find suitable computers. Click the
Configure button to open the Find Computers dialog box, open the Advanced
tabbed page, and specify a field name, condition, and value that selects computers
that you want to include (see Figure 11). Remember also to select the appropriate
target domain in the Domain drop-down list.
Figure 11
Specifying computers in Active Directory using the Find Computers dialog box
6. Alternatively, you can simply browse or type in the names of the computers you
want to include. In this case, select the second option on the Discovery Method
page. Either type a list of names directly into the text box at the bottom of the page,
or, to select computers, click the Browse button to open the Active Directory Select
Computers dialog box.
7. Click Next to show the Administrator Account page. Here, you must provide the
credentials of an account that has administrator-level privileges on the target
computers, and which Operations Manager will use when installing the remote
agents on these computers. You can specify the existing action account that
Operations Manager creates during installation if you first ensure this has the
relevant privileges. However, it is usually better to select the Other user account
option and enter the credentials of a domain-level administrative account. After
installation completes, the account you specify here is no longer used.
8. Click Discover to start the discovery process. You will see a dialog box that indicates
the status. The Pending Actions node in the tree view (below the Device
Management node) also lists all pending processes such as this discovery process.
9. The status dialog box indicates when discovery and installation are complete.
You can install the Operations Manager agent on a remote computer by running the
Operations Manager setup on that computer, using your setup CD or by loading the
setup .msi file from a network drive. In this case, ensure that you edit the Operations
Manager management server security settings for installation of remote agents. Go to
the Settings section in the Administration view in the Operations Console and double-
click the Security entry in the Server section to open the Global Management Server
Settings – Security dialog box that contains the options to accept remote agent
installation.
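For a manual installation like the one described in the note above, you can also run the agent package unattended from a command prompt. The property names below are those commonly used with the Operations Manager 2007 MOMAgent.msi package, but the management group and server values here are placeholders, and you should verify the supported properties against the documentation for your version before scripting a rollout:

```bat
msiexec /i MOMAgent.msi /qn ^
    USE_SETTINGS_FROM_AD=0 ^
    MANAGEMENT_GROUP=NorthernElectronics ^
    MANAGEMENT_SERVER_DNS=opsmgr01.example.com ^
    ACTIONS_USE_COMPUTER_ACCOUNT=1
```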

Best Practices for Deploying the Operations Manager 2007 Agent


When deploying the Operations Manager 2007 agent, you should consider the following proven
practices:
• Use the Discovery Wizard to find remote computers where you want to install the
Operations Manager agent. You can scan Active Directory, but it is usually quicker to
specify a condition and search for computers that match this condition—for example,
computers whose name starts with a specific string such as "WEB" or "SALES".
• Specify the types of device you want to discover, such as servers and clients, servers
only, or clients only to narrow the search.

Viewing Management Information in Operations Manager 2007


After you import or create a Management Pack for your application, you can use it to monitor
your application. You will usually also take advantage of the existing Management Packs
provided with System Center Operations Manager 2007. These additional Management Packs
allow operators to detect faults in the underlying infrastructure, such as performance
degradation or core operating system service failures, and to monitor services such as
Microsoft Exchange and SQL Server, in addition to the errors and performance issues directly
associated with the application.
This section includes procedures for using both the Operations Console and the Web Console.
The Operations Console allows you to view the state of an application and drill down to see
details of the events, alerts, performance counters, and computers that run the application. The
Web Console has less functionality, but it can still be of great use to operators, particularly when
the Operations Console is not installed.
To view management information in Operations Manager 2007
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. In the navigation pane, click the Monitoring button.
If the navigation pane is not visible, click Navigation Pane on the View menu.
2. The first section of the tree view in the left pane displays the five basic monitoring
information views:
◦ Active Alerts. In this view, the main window shows a list of all alerts raised by
the rules and monitors. Select an alert to see the details and the knowledge for
verifying, resolving, and re-verifying this problem in the lower details pane (see
Figure 12). If the details pane is not visible, click Detail Pane on the View menu.
Double-click an alert to see its properties, including a summary of the
knowledge, history, and context. Right-click an alert, click Open on the
shortcut menu, and then select from the four available views: Diagram View,
Event View, Performance View, or State View. You can alternatively open the
Health Explorer for the selected computer from here or a PowerShell command
prompt. Right-click an alert, click Set Resolution State, and then click either New
or Closed to change the resolution state for this alert. If you click Closed,
Operations Manager removes it from the list.

Figure 12
Viewing the knowledge for an unresolved alert in Active Alerts view
◦ Computers. In this view, the main window shows a list of computers within the
current scope, and the health state of each one—including the monitored
features it supports such as Agent, Management Server, or Windows Operating
System. Right-click a computer, click Open on the shortcut menu, and then
select from the five available views: Alert View, Diagram View, Event View,
Performance View, or State View. You can alternatively open the Health
Explorer or a PowerShell command prompt for the selected computer from
here.
◦ Discovered Inventory. In this view, the main window shows the overall state,
display name, and the path to each computer in the current scope. Double-click
a computer to see the properties of that computer. Right-click a computer, click
Open on the shortcut menu, and then select from the four available views: Alert
View, Diagram View, Event View, or Performance View. You can alternatively
open the Health Explorer or a PowerShell command prompt for the selected
computer from here.
◦ Distributed Applications. In this view, the main window shows the distributed
applications in the current scope and the overall state for each one. Right-click
an application, click Open on the shortcut menu, and then select from the five
available views: Alert View, Diagram View, Event View, Performance View, or
State View. You can alternatively open the Health Explorer or a PowerShell
command prompt for the selected application from here.
◦ Task Status. In this view, the main window shows all the tasks that Operations
Manager has carried out, such as discovering computers, installing agents, and
executing monitoring probes. The details pane shows the output from each task
as you select it in the main Task Status list. If the details pane is not visible, click
Detail Pane on the View menu. Right-click a task, and then click Health Service
Tasks to see a list of the many tasks you can execute. These include a range of
configuration, probe, discovery, recovery, and execution tasks.
3. The first four of the basic monitoring categories listed in the previous step provide
views containing more detailed information:
◦ Alert view. This shows a list of active alerts for only the selected computer or
application.
◦ Diagram view. This shows a schematic representation of this computer or
application. This is a useful view for understanding the structure of a distributed
application or series of hierarchical groups (see Figure 13). It shows the overall
health state for each component as well as the application as a whole, and you
can expand and collapse the nodes to explore where any problems or
performance issues exist. Right-click any of the components, and then click
Health Explorer to open a window that contains a tree view where you can
explore the individual rules and monitors for the entire application; you can also
see details of the state of each one and the associated knowledge that helps to
verify, diagnose, resolve, and re-verify any problems.
Figure 13
Viewing the schematic structure and state of a distributed application in Diagram
view
◦ Event view. This shows details of the source events for the computer or
application. Right-click an event, and then select Show associated rule
properties to see the rules associated with the event. The details pane shows
the properties of the selected event.
◦ Performance view. This shows the performance counters available for a
computer or application. Select a counter from the list to see a graph of the
values over time (see Figure 14). This window contains commands on the
Actions menu that allow you to select the time range, copy or save the graph
image, and copy the source data to the clipboard for further examination and
analysis. For a baseline counter, you can also pause or restart a collection, or
you can reset the baseline values.

Figure 14
Viewing the history for a performance counter in Performance view
◦ State view. This shows the overall state of the computer or application. This
window shows the state and properties of the selected computer or application,
and contains commands to show the Health Explorer window and view reports.
◦ Other options available in the views listed in steps 2 and 3 allow you to start
Maintenance mode for an application or a computer, or create personalized
views with specific columns, grouping, and ordering to suit your requirements.
4. In any view, select a distributed application or a computer and double-click to open
the Health Explorer window. In the left pane tree view, expand the nodes to show
the overall state for the application or computer (the Entity Health node). Within
this node, depending on the structure of your application, you see the rolled-up
health state and the individual category health states for each component. Figure 15
shows the health state for an example distributed application.

Figure 15
The Health Explorer window for a distributed application
5. As you select each node in the Health Explorer tree view, the Knowledge tabbed
page in the right pane shows the product and company-specific knowledge for that
node. The State Change Events tab page shows a list of the events that caused
changes to the state, and the event context.
6. Examine the other views available in Monitoring mode to see a high-level view of the
computers and the applications they are running, and the overall health state of
each one. Expand the nodes in the tree view in the left pane of the main Operations
Console window for the category of information you are interested in. Available
categories include agentless monitored computers, Windows client computers,
Windows Server computers, Web applications, network devices, Operations
Manager itself, and any application groups you have created. Select the State node
within each category group to see an overall view of the state for that category. You can
then right-click entries in the main window to see the different views (described in
step 3), or open Health Explorer or the PowerShell command prompt.

To use the Web Console to connect over an external network such as the Internet
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Web Console. The Web Console provides only monitoring features and
displays a much simpler interface for selecting and viewing information (see Figure
16).

Figure 16
The Web Console provided with System Center Operations Manager 2007
2. The left pane tree view displays only four basic categories and a reduced set of other
monitoring categories. However, it still provides a wealth of monitoring capabilities,
and works in much the same way as the standard Operations Console.

Guidelines for Viewing Management Information in Operations Manager 2007
When viewing management information in Operations Manager 2007, consider the following
proven practices:

• If you connect directly to the management domain, use the Operations Console to
monitor applications and computers. If you connect from a remote location over the
Internet or an intranet, use the Web Console to monitor applications and computers.
• Use the Scope option on the View menu to limit your view to the appropriate
distributed applications or groups and subgroups, unless you want to see alerts raised
by all the managed computers for all events.
• Use the State view and the Diagram view to provide an overall picture of the health
state of the application. In Diagram view, you can also see the state of the subgroups
and individual computers.
• Use the Alerts view to obtain a list of alerts, generally sorted by descending severity,
which is useful in prioritizing diagnosis and resolution requirements, and the
corresponding actions.
• Use the Events view to see the details of source events, and use the Performance view
to see the values and history of performance counter samples. Both are useful in
diagnosing problems and verifying resolution.
• Use the Health Explorer to see the state of individual components, individual categories
(such as Configuration or Performance), and individual monitors and rules.
• Create personalized views if you want to see information displayed in a different order
or in different groups.

Creating Management Reports in Operations Manager 2007


Regular and continuous monitoring makes it easier to detect application failures, problems, or
unsatisfactory performance, but the actions taken by administrators and operations staff are
usually short-term in nature. They tend to concentrate on the present, and may not take into
account historical events and performance over longer periods that indicate fundamental issues
or causes.
However, business owners and hosting services must often conform to specified service level
agreements (SLAs) about performance and availability. The data required to support these
agreements only appears over longer periods and requires access to historical information.
Data captured in summary reports can also be vital to operations staff in detecting missed
computers, or incorrectly configured application or computer groups, particularly in large and
complex installations. These reports may be the only way that operations staff can match
monitoring infrastructure to the physical hardware.
System Center Operations Manager 2007 includes a report generator that uses SQL Server
Reporting Services to publish the performance and error details stored in its own database.
This can provide useful summaries of application performance, and the history of issues
encountered with an application. You can use the reports to view the overall performance over
time and detect specific problem areas with your application.

The reporting feature for System Center Operations Manager 2007 is a separate installation
from the monitoring system. You must rerun the setup for Operations Manager and select
Operations Manager 2007 Reporting to install the reporting feature.
To view monitoring and management reports in Operations Manager 2007
1. On the taskbar, click Start, point to System Center Operations Manager 2007, and
then click Operations Console. There are two ways to create a report. To view
information for a single computer or a single distributed application, go to step 2 of
this procedure. To view information for multiple computers, distributed applications,
or other entities, go to step 5 of this procedure.
2. To view information for a single computer or a single distributed application, click
the Monitoring button in the navigation pane at the lower-left of the window. If the
navigation pane is not visible, click Navigation Pane on the View menu.
3. In the left pane tree view, select either the computer you want to view information
for in the Computers section or the application you want to view information for in
the Distributed Applications section.
4. The Actions pane to the right of the main window contains a series of links to the
most popular report types. If you cannot see the Actions pane, click Actions on the
View menu. Click the report you want to generate to open the Report Viewer
window, and then go to step 7 of this procedure.
5. To view information for multiple computers, distributed applications, or other
entities, click the Reporting button in the navigation pane at the lower-left of the
window. Note that the Reporting button is not available until you install the
reporting feature for Operations Manager 2007.
6. In the left pane tree view, expand the Reporting node, and then click Microsoft
Generic Report Library. Right-click a report in the list in the main window, and then
click Open to open the Report Viewer window.
7. The Report Viewer window contains a series of controls where you specify the
period for the report, the objects to include, and any other parameters specific to
that report. For example, when you open the Alerts report, you can specify the
severity and priority of alerts that the report will include. Figure 17 shows these
parameter settings and the other Report Viewer controls, and the way that you can
select the period for the report.

Figure 17
The Report Viewer showing the controls for the parameters for the report
8. If you specified a computer or a distributed application and opened Report Viewer
from the Monitoring section of the Operations Manager console, the Objects list in
Report Viewer will contain the item you selected. If you opened Report Viewer from
the Reporting section of the Operations Manager console, the Objects list will be
empty.
9. To add items to the Objects list, click the Add Group or Add Object button. In the
dialog box that opens, select a search option in the drop-down list, such as Contains
or Begins with, and then enter the text part of the name of the object(s) or group(s)
you want to find. If you want to specify the dates between which objects or groups
were created, or the management group they belong to, click the Options button,
and then enter the relevant details.
10. Click the Search button to view all matching items in the Available items list. Select
those you want to include in the report (you can hold down the SHIFT and CTRL keys
to select multiple items in the list), and then click the Add button to add them to the
Selected objects list. Then click OK to return to Report Viewer.
11. Set any other parameter values you require in the controls at the top of the Report
Viewer window, and then click the Run button on the main toolbar to start the
report running. After a few moments, the report appears (see Figure 18).

Figure 18
The results of running the Event Analysis report for two computers
The reports included with Operations Manager 2007 allow you to view alerts and alert latency;
availability and health; custom configuration and configuration changes; event analysis, most
common events, and custom events; and performance and health details. You can also author
your own reports, and set up scheduled reporting. Figure 19 shows the graphical reports for
alert latency over 1 second.
Figure 19
The results of running the Alert Latency report for all alerts during one day

Guidelines for Creating Management Reports in Operations Manager 2007
When creating management reports in Operations Manager 2007, you should consider the
following guidelines:
• Use System Center Operations Manager Report Viewer to examine the historical
performance of an application to ensure that it performs within the service level
agreements (SLAs) or the parameters defined by business rules.
• Use the reports to discover inconsistencies in performance, check overall reliability, and
detect problematic situations such as unreliable networks—and the times when these
issues most commonly arise.
• Use the reports to confirm management coverage of the computers running the
application, and deployment of the appropriate sets of rules to each group.

Summary
Management Packs can be a very useful tool for the operations team in managing applications.
This chapter demonstrated how to create and import Management Packs in Operations
Manager 2007, and then it described how to edit the Management Packs to provide the
functionality required when monitoring an application.
Section 5
Technical References
This section provides additional technical resources that can be of use when designing and
developing manageable applications. Chapter 18, "Design of the DFO Artifacts," is incomplete in
the preliminary version of this guide. Chapter 19 describes how to create or modify a guidance
package to modify the application management model defined in the Team System
Management Model Designer Power Tool (TSMMD).
This section is aimed primarily at solutions architects and application developers.
Appendix A, "Building and Deploying Applications Modeled with the TSMMD"
Appendix B, "Walkthrough of the TSMMD Tool"
Appendix C, "Performance Counter Types"
Appendix A
Building and Deploying Applications
Modeled with the TSMMD
In this preliminary version of the guide, this appendix provides guidance on how you can consume
the instrumentation artifacts generated by the Team System Management Model Designer
Power Tool (TSMMD) in your applications, and how you can deploy the applications complete
with the appropriate instrumentation. This appendix also explains how you can generate
Management Packs for System Center Operations Manager using the TSMMD. The topics in this
appendix are:
• Consuming the Instrumentation Helper Classes
• Verifying Instrumentation Coverage
• Removing Obsolete Events
• Deploying the Application Instrumentation
• Specifying the Runtime Target Environment and Instrumentation Levels
• Generating Management Packs for System Center Operations Manager 2007
• Importing a Management Pack into System Center Operations Manager 2007
• Using Management Packs with System Center Operations Manager
• Creating a New Distributed Application

Consuming the Instrumentation Helper Classes


After you generate the instrumentation helper classes for a model, you can make calls to these
classes in your application. The abstraction of the instrumentation into separate classes makes it
easier to focus on the application code without having to worry about the instrumentation
requirements. If the model changes, you can regenerate the helper classes and use them
without requiring changes to the application code (provided that the existing instrumentation
still exists in the model).
If you change the name of a managed entity and regenerate the instrumentation helper
classes, you must update the references and your application code to match the new
instrumentation helper class names.

To call the instrumentation helper classes


1. Open the TSMMD solution in Visual Studio 2008, and then open Solution Explorer. If
you cannot see Solution Explorer, click Solution Explorer on the View menu.
2. In Solution Explorer, right-click the top-level solution entry, point to Add, and then click
New Project. Select the required project type, such as Windows Forms Application, and
then enter the name for the project and any other required information.
3. In Solution Explorer, right-click the new project, and then click Add Reference. In the
Add Reference dialog box, click the Projects tab, and then select the [entity-name].API
and [entity-name].[target-environment].Impl projects for all of the managed entities in
your model. For example, if you have two managed entities named DatabaseEntity and
WebsiteEntity and two target environments named HighTrust and MediumTrust, you
would select the following projects:
◦ DatabaseEntity.API
◦ DatabaseEntity.HighTrust.Impl
◦ DatabaseEntity.MediumTrust.Impl
◦ WebsiteEntity.API
◦ WebsiteEntity.HighTrust.Impl
◦ WebsiteEntity.MediumTrust.Impl
4. In the code of the application that consumes the instrumentation, call the methods of
the instrumentation helper classes to raise events or increment performance counters.
For example, to raise an event named DatabaseFailedEvent that takes as a parameter
the name of the database, you can use code like the following.
C#
DatabaseEntity.API.DatabaseEntityAPI.GetInstance().RaiseDatabaseFailedEvent("SalesDatabase");
Visual Basic
DatabaseEntity.API.DatabaseEntityAPI.GetInstance().RaiseDatabaseFailedEvent("SalesDatabase")

5. To increment a performance counter, you call either the Increment[measure-name] or
the IncrementBy[measure-name] method of the instrumentation helper class. For
example, to increment a counter named OrdersProcessedCounter, you can use code
like the following.
C#
// increment counter by the default value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementOrdersProcessedCounter();
// increment counter by a specified value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementByOrdersProcessedCounter(5);
Visual Basic
' increment counter by the default value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementOrdersProcessedCounter()
' increment counter by a specified value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementByOrdersProcessedCounter(5)

For a detailed description of the instrumentation projects and artifacts, see Chapter 8,
"Creating Reusable Instrumentation Helpers."

Verifying Instrumentation Coverage


After developers add the instrumentation classes to a project and call them from the
application, they can perform a validation check in Visual Studio to ensure that the application
code does in fact call the instrumentation methods of the generated API classes. The verification
check confirms that the code makes at least one call to an overload of every method defined in
the instrumentation classes. Developers can also use the verification process to provide a
checklist of tasks when instrumenting applications. Figure 1 shows a case where the application
does not make calls to the helper methods.
Figure 1
The error list generated by the Verify Instrumentation Coverage recipe
To verify instrumentation coverage for a project
1. In Visual Studio, ensure that the TSMMD guidance package is enabled:
a. On the Tools menu, click Guidance Package Manager.
b. In the Guidance Package Manager dialog box, click the Enable/Disable
Packages button.
c. In the Enable and Disable Packages dialog box, select the TSMMD
Instrumentation and TSMMD Management Pack Generation check boxes.
d. In the Enable and Disable Packages dialog box, click OK, and then click Close in
the Guidance Package Manager dialog box.
2. In Solution Explorer, right-click the .tsmmd model file, and then click Verify
Instrumentation Coverage (C#).
3. The TSMMD looks in your solution projects for calls to all of the abstract
instrumentation methods defined in the instrumentation helper classes. Any missing
calls (instrumentation methods that you do not call from the application) appear in the
Visual Studio Error List window.
In the current release of the TSMMD, you can only verify coverage for applications written in
Visual Basic and C#. If you create your application using any other language, the TSMMD will
not be able to locate calls to the instrumentation, and will report an error.

An additional limitation in this release is that the instrumentation discovery process will not
locate instrumentation in an ASP.NET Web application written in Visual Basic.

Deploying the Application Instrumentation


When deploying and installing an application instrumented using the TSMMD tool, you must
also install the instrumentation used by the application. You achieve this by building your
solution in the usual way, and then running the installation utility against each instrumentation
technology DLL. The installation utility InstallUtil.exe is part of the default installation of the
.NET Framework.
Depending on the instrumentation defined in your management model, you may need to install
one or more of the following technology DLLs:
• EventLogEventsInstaller.dll
• WindowsEventing6EventsInstaller.dll
• WmiEventsInstaller.dll
• PerformanceCountersInstaller.dll

If you include Enterprise Library Log Events in your model, the configuration file created by the
TSMMD will contain the configuration information that Enterprise Library requires. You must
copy this into your application configuration file, as described in the section Specifying the
Runtime Target Environment and Instrumentation Levels. You must also ensure that Enterprise
Library is installed on the target computer(s) where you deploy your application.

Installing Event Log Functionality


Before your application can write event log entries, you must specify settings for the event log in
the Windows registry. These changes require administrative rights on the local computer, so they
should be made when the application is installed, not at runtime.
The instrumentation generation process creates a suitable EventLogEventsInstaller class in the
EventLogEventsInstaller subfolder. You can use the InstallUtil.exe utility with the
EventLogEventsInstaller class to install the event logs with your application.

The EventLogEventsInstaller class can install event logs only on the local computer.
Installing Windows Eventing 6.0 Functionality
The TSMMD creates a Windows Eventing 6.0 manifest file if the model defines any Windows
Eventing 6.0 events. Before your application can write event log entries, you must install the
publisher file, including the manifest, on the target system. To do this, you use the Wevtutil.exe
utility. The command you must execute on the target system is:

wevtutil install-manifest EventsDeclaration.man

You will usually execute this command during the installation process for your application. The
Wevtutil utility can usually be executed only by members of the Administrators group, and must
run with elevated privileges.
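When the installation is scripted in managed code rather than a batch file, the same Wevtutil command can be launched from a setup routine. The following C# sketch is purely illustrative (the BuildManifestInstallStep helper is our own naming, not part of the TSMMD output), and the launched process must still run with elevated privileges:

```csharp
using System;
using System.Diagnostics;

// Hypothetical setup-routine step: build the command that registers the
// TSMMD-generated Windows Eventing 6.0 manifest by running Wevtutil.exe.
static ProcessStartInfo BuildManifestInstallStep(string manifestPath)
{
    return new ProcessStartInfo("wevtutil", "install-manifest " + manifestPath)
    {
        UseShellExecute = false
    };
}

// During installation (requires elevation):
// using (Process p = Process.Start(BuildManifestInstallStep("EventsDeclaration.man")))
// {
//     p.WaitForExit();
//     if (p.ExitCode != 0)
//         throw new InvalidOperationException("Manifest installation failed.");
// }
```

Starting a process from this ProcessStartInfo performs the same registration as the command line shown above.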
The TSMMD can also generate a Windows Eventing 6.0 view that allows you to display events
from your application in a custom view of the Event Log. To create a Windows Eventing 6.0
view, right-click on the top-level entry in the Management Model Explorer window and click
Generate Windows Eventing 6.0 View. The TSMMD creates a new XML view file named [model-
name]View.xml and opens it in Visual Studio.

Publishing the Schema for an Instrumented Assembly to WMI


If your application instrumentation definition includes any WMI Events, you must register the
appropriate WMI schema in the WMI repository. The instrumentation generation process
creates a suitable WmiEventsInstaller class in the WmiEventsInstaller subfolder. You can use the
InstallUtil.exe utility with the WmiEventsInstaller class to install the events with your
application.

As a convenience for developers at design time, Windows automatically publishes a WMI
schema the first time an application raises an event or publishes an instance. This avoids the
requirement to declare a project installer and run the InstallUtil.exe utility during prototyping
of an application. However, this registration will succeed only if the user invoking it is a
member of the Local Administrators group, and therefore you should not rely on this approach
as a mechanism for publishing the schema.

Installing Performance Counters


If your application instrumentation definition includes any Windows Performance Counters, you
must register these before the application can use them. The instrumentation generation
process creates a suitable PerformanceCountersInstaller class in the
PerformanceCountersInstaller subfolder. You should edit this file before use to set the value of
the CounterHelp property for each counter instance. This value determines the help text shown
in the Performance Counter viewer and monitoring tools that display the counter values. You
can use the InstallUtil.exe utility with the PerformanceCountersInstaller class to install the
counters with your application.
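As a rough illustration of where the CounterHelp value fits, the generated installer class is broadly similar to the following C# sketch. The category name, counter name, counter type, and help text shown here are illustrative assumptions; the actual generated class reflects the measures defined in your model:

```csharp
using System.ComponentModel;
using System.Configuration.Install;
using System.Diagnostics;

// Hypothetical sketch of a TSMMD-style performance counter installer.
// InstallUtil.exe discovers it through the RunInstaller attribute.
[RunInstaller(true)]
public class PerformanceCountersInstaller : Installer
{
    public PerformanceCountersInstaller()
    {
        PerformanceCounterInstaller installer = new PerformanceCounterInstaller();
        installer.CategoryName = "Shipping Application";   // assumed category name

        CounterCreationData counter = new CounterCreationData();
        counter.CounterName = "OrdersProcessedCounter";
        counter.CounterType = PerformanceCounterType.NumberOfItems32;
        // Edit this value before installing: it is the help text that
        // Performance Monitor and other tools display for the counter.
        counter.CounterHelp = "The total number of orders processed.";

        installer.Counters.Add(counter);
        Installers.Add(installer);
    }
}
```

Running InstallUtil.exe against the built assembly executes this installer and registers the category and its counters on the local computer.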
Using a Batch File to Install Instrumentation
The easiest way to install instrumentation using the classes described in the previous sections is
with a batch file that executes the Installutil.exe utility. The following listing shows an example
batch file.

InstallUtil
Instrumentation\EventLogEventsInstaller\bin\Debug\EventLogEventsInstaller.dll
InstallUtil
Instrumentation\PerformanceCountersInstaller\bin\Debug\PerformanceCountersInst
aller.dll
InstallUtil
Instrumentation\WmiEventsInstaller\bin\Debug\WmiEventsInstaller.dll

Using the Event Messages File


In order for the text of event messages to appear in Windows Event Log, you must register the
assembly containing these messages on the target system. You can use the event messages file
that the TSMMD generates.
To install the Event Messages file
1. Add a reference to the EventLogEventsInstaller.dll located in the Instrumentation folder
to your application project. This allows Visual Studio to copy this assembly, which
contains install information, into the execution directory of the application.
2. In Solution Explorer, right-click on the top-level solution entry and click Rebuild All.
3. In Windows Explorer, navigate to the Installation subfolder of your solution, and copy
the Source1_Messages.dll file from the output directory of the
EventLogEventsInstaller project into the execution folder of your application.
4. Open a Visual Studio Command Prompt window, navigate to the execution folder of
your application, and register the event messages assembly by executing the following
command:

InstallUtil EventLogEventsInstaller.dll /i

Specifying the Runtime Target Environment and Instrumentation Levels
The code generation process creates the instrumentation code configuration file, named
InstrumentationConfiguration.config, in the Instrumentation folder of the project. This file,
shown in the following listing, contains the information that developers or operators will copy
into their application configuration files, and edit to specify the target environment under which
the application will run and the required instrumentation granularity.

<configuration>
  <configSections>
    <section name="tsmmd.instrumentation"
        type="Microsoft.Practices.DFO.Guidance.Configuration.ApplicationHealthSection,
              Microsoft.Practices.DFO.Guidance.Configuration"/>
    <!-- this section included if model contains Enterprise Library Events -->
    <section name="loggingConfiguration"
        type="Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.LoggingSettings,
              Microsoft.Practices.EnterpriseLibrary.Logging, Version=3.1.0.0,
              Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
  </configSections>

  <tsmmd.instrumentation>
    <!--
      Attribute "targetEnvironment" can have values:
        Extranet
        LocalIntranet
      Attribute "instrumentationLevel" can have values:
        Off
        Coarse
        Fine
        Debug
    -->
    <add name="CustomerDatabase" targetEnvironment="LocalIntranet"
         instrumentationLevel="Coarse"/>
    <add name="CustomerWebService" targetEnvironment="LocalIntranet"
         instrumentationLevel="Coarse"/>
    ...
    ... all other managed entities in model listed here ...
    ...
  </tsmmd.instrumentation>

  <!-- this section included if model contains Enterprise Library Events -->
  <loggingConfiguration name="Logging Application Block" tracingEnabled="true"
                        defaultCategory="General"
                        logWarningsWhenNoCategoriesMatch="true">
    ...
    ... default logging configuration here ...
    ...
  </loggingConfiguration>
</configuration>

Developers copy the contents of this file (excluding the <configuration> element) into their
application configuration file and edit the values as required. The <tsmmd.instrumentation>
element contains an <add> element for each managed entity in the model, identified by the
entity name. Each <add> element defines two other attributes:
• targetEnvironment. This is one of the target environments defined in the model, and
controls which of the concrete event and measure (counter) implementations the
abstract API class methods will use in the application at runtime. It defines mapping
between the target environments in the model and the concrete event and measure
implementations.
• instrumentationLevel. This indicates the level at which the instrumentation will raise
events or increment counters. Every abstract event and measure in the original model
defines a value for its Instrumentation Level property. The options are Coarse (all
operations), Fine (diagnostic and debug operations only), Debug (debug operations
only), and Off (instrumentation disabled).

The following table shows how the combination of the Instrumentation Level property of an
event and the setting of the instrumentationLevel attribute in the configuration file affects the
raising of events.

Instrumentation level (event)    Overall instrumentation level    Event raised?
Coarse                           Coarse                           Yes
Coarse                           Fine                             Yes
Coarse                           Debug                            Yes
Fine                             Coarse                           No
Fine                             Fine                             Yes
Fine                             Debug                            Yes
Debug                            Coarse                           No
Debug                            Fine                             No
Debug                            Debug                            Yes
Off                              Coarse                           No
Off                              Fine                             No
Off                              Debug                            No

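The pattern in the table is a simple ordering rule: treating the levels as increasingly verbose (Coarse, then Fine, then Debug), an event is raised only when the overall instrumentationLevel setting is at least as verbose as the level defined for the event, and Off suppresses everything. The rule can be sketched as follows (an illustrative sketch in Python; this is not part of the TSMMD-generated code):

```python
# Ordered instrumentation levels; higher numbers are more verbose.
LEVELS = {"Coarse": 1, "Fine": 2, "Debug": 3}

def event_raised(event_level, overall_level):
    """Return True when an event whose Instrumentation Level property is
    event_level fires under the configured instrumentationLevel setting."""
    if event_level == "Off" or overall_level == "Off":
        return False  # Off disables instrumentation entirely
    # An event fires only when the overall setting is at least as
    # verbose as the level declared for the event in the model.
    return LEVELS[overall_level] >= LEVELS[event_level]
```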
The instrumentation configuration file created by the TSMMD code generation routines contains
settings that specify the runtime target environments and instrumentation levels for the
managed entities within the application. When you deploy your application, you must copy the
contents of the instrumentation configuration file into your application configuration file and
edit it to specify the appropriate settings.
The configuration file, named InstrumentationConfiguration.config, resides with the generated
instrumentation classes in the Instrumentation folder of the TSMMD solution. Its
<tsmmd.instrumentation> section contains an <add> element for each managed entity in your
application. This element defines the target environment within which that entity will execute,
and the granularity of the instrumentation. You must copy the contents of this file (excluding
the <configuration> element) into your application configuration file and edit the values as
required.
If you specified any Enterprise Library Log Events in your model, you must also copy the entire
<loggingConfiguration> section, and the corresponding <section> element from the
<configSections> section, into your application configuration file.

Remember that the term "target environment" refers to the capability for specifying multiple
events or performance counters for an aspect of an entity, and having the entity use a specific
one of these events or counters at runtime depending on the requirements of the application,
the execution permissions available, and the limitations of the runtime environment.

To specify runtime target environment and instrumentation levels


1. Open the file named InstrumentationConfiguration.config in Visual Studio or any other
text or XML editor.
2. Open your application configuration file (usually Web.config or App.config) in Visual
Studio or any other text or XML editor.
3. If your application configuration file already contains a <configSections> element, copy
only the <section name="tsmmd.instrumentation" ... /> element (including all of its
attributes) from the InstrumentationConfiguration.config file into the <configSections>
element in your application configuration file. If you are using any Enterprise Library Log
Events in your model, you must also copy the <section name="loggingConfiguration"
... /> element (including all of its attributes) from the
InstrumentationConfiguration.config file into the <configSections> element in your
application configuration file.
If your application configuration file does not contain a <configSections> element, copy
the entire <configSections> element from the InstrumentationConfiguration.config file
into your application configuration file.
4. Copy the entire <tsmmd.instrumentation> element from the
InstrumentationConfiguration.config file into your application configuration file, and
place it within the root <configuration> element but outside all other elements.
5. If you are using any Enterprise Library Log Events in your model, you must also copy the
entire <loggingConfiguration> element from the InstrumentationConfiguration.config
file into your application configuration file, and place it within the root <configuration>
element but outside all other elements.
6. Close the InstrumentationConfiguration.config file.
7. In the <tsmmd.instrumentation> section of your application configuration file, locate
the <add> element for the managed entity you want to configure.
8. Edit the value of the targetEnvironment attribute to specify the name of one of the
target environments defined in the management model. A target environment maps
one of the concrete instrumentation implementations to its abstract event or measure.
Therefore, setting this attribute specifies which of the concrete implementations the
instrumentation helper code will call when the application code executes the methods
of the abstract instrumentation class.
9. Edit the value of the instrumentationLevel attribute to specify the granularity of the
instrumentation defined in the management model. The values you can use are Coarse,
Fine, Debug, and Off. Setting this attribute to one of these values controls how the
instrumentation will behave.
10. Save and close your application configuration file.
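After you complete these steps, the relevant parts of a minimal application configuration file (for a model without Enterprise Library Log Events) would look similar to the following sketch, assuming the CustomerDatabase entity from the earlier example. The type attribute of the <section> element is abbreviated here; copy its full value verbatim from InstrumentationConfiguration.config:

```xml
<configuration>
  <configSections>
    <!-- Copied from InstrumentationConfiguration.config; the full
         type attribute is omitted here for brevity -->
    <section name="tsmmd.instrumentation" type="..." />
  </configSections>
  <!-- Placed inside <configuration> but outside all other elements -->
  <tsmmd.instrumentation>
    <add name="CustomerDatabase" targetEnvironment="LocalIntranet"
         instrumentationLevel="Coarse"/>
  </tsmmd.instrumentation>
</configuration>
```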

Generating Management Packs for System Center Operations Manager 2007
If you intend to monitor your application using Microsoft System Center Operations Manager
2007, you can use your management model to generate a management pack that you can install
into Operations Manager. The management pack will contain a template for a distributed
application; base classes and components for each managed entity in your model; and the
monitors and rules required to monitor the application.
The TSMMD can generate management packs automatically when you build an application, or
you can generate management packs from the command line. For more information about
generating management packs from the command line, see the documentation installed with
the TSMMD.
To generate a management pack for System Center Operations Manager 2007
1. In Visual Studio, ensure that the TSMMD guidance package is enabled:
a. On the Visual Studio Tools menu, click Guidance Package Manager.
b. In the Guidance Package Manager dialog box, click Enable/Disable Packages.
c. In the Enable and Disable Packages dialog box, make sure that the Team
System MMD Instrumentation and TSMMD Management Pack Generation for
OpsMgr 2007 check boxes are selected.
d. Click OK and Close to return to the management model designer.
2. In Management Model Explorer, right-click the top-level node, and then click Generate
Management Pack for OpsMgr2007. Alternatively, right-click anywhere on the model
designer surface, and then click Generate Management Pack for OpsMgr2007. If you
cannot see Management Model Explorer, point to Other Windows on the View menu,
and then click Management Model Explorer.
3. By default, unless you changed the name of the model or the settings in the properties
page for the project, the TSMMD generates a management pack named
Application.Operations.xml in a folder named ManagementPack within the same folder
as your project. Open this folder in Windows Explorer, and open the new management
pack in a text editor to view the contents.
4. To import a TSMMD-generated management pack into System Center Operations
Manager, you must also import the standard management packs upon which the
generated management pack depends.

You can also specify the properties required for the management pack in the properties window
for the TSMMD project, and then have the TSMMD create the management pack automatically
when you build the project.
To change the settings for automatic System Center Operations Manager management
pack generation
1. In Solution Explorer, right-click the TSMMD project entry and click Properties to open
the project properties window in the main Visual Studio editor pane. The Settings page
of the project properties determines if the TSMMD will automatically generate a
management pack when you build the TSMMD project, and the parameters for the
management pack generation process.
2. To enable automatic generation of a management pack, select the check box named
Enable Microsoft SCOM 2007 management pack generation at the top of the General
section.
3. Edit the default values in the text boxes below this check box as required. You can
specify the following properties:

◦ Management pack ID. This setting is the fully qualified identifier for the
management pack that the TSMMD will automatically generate. The default is
Application.[model-name]. The name must start with a letter or a number, and
contain only letters, numbers, periods, and underscore symbols. The total
length must be less than 255 characters, and the value must be unique within the
scope of the System Center Operations Manager server to which you will import
the management pack.
◦ Management pack display name. This setting is the name for the management
pack. The default is the current management model name.
◦ Default namespace. This setting is the namespace in which the management
pack will reside. The default is Application.

◦ Output path. This setting is the full path for the generated management pack.
Click the Browse button next to the Output path text box and select the folder
where you want to create the management pack. The default is a folder named
ManagementPack within your project folder.

Importing a Management Pack into System Center Operations Manager 2007
After you create a management pack using the management model designer, you can import
the management pack into Microsoft System Center Operations Manager 2007.
To import a management pack into System Center Operations Manager
1. In the System Center Operations Manager Operations Console, click the Administration
tab in the left pane.
2. Right-click the top-level Administration item in the left tree view pane, and then click
Import Management Packs.
3. In the Select Management Packs to Import dialog box, navigate to the folder that
contains the management pack you created in the TSMMD, select the management
pack, and then click Open.
4. The Import Management Packs dialog box shows the management pack you selected. If
there are any prerequisites or referenced management packs that are not already
installed, the dialog box displays a warning and details of the required packs (for more
information, see the next section, "Prerequisite Management Packs"). If this warning
appears, click the Add button at the top of the dialog box, and then locate and select
the required management packs.
5. The Import Management Packs dialog box analyzes the selected management pack(s)
and indicates whether it can successfully import them. After the dialog box shows that
the analysis succeeded (every management pack has a green check mark in the list),
click Import.
6. In the Administration tree view, click the Management Packs node to see a list of all
installed management packs.
Prerequisite Management Packs
If your management pack includes references to ASP.NET applications or Web services, you
must also import the following system infrastructure management packs if you have not already
done so:
• Microsoft.Windows.InternetInformationServices.CommonLibrary.mp
• Microsoft.Windows.InternetInformationServices.2003.mp
• Microsoft.SystemCenter.ASPNET20.2007.mp

The first two of these management packs are part of the Microsoft Windows Server 2000/2003
Internet Information Services Management Pack, which you can obtain from the Microsoft
Download Center. The third of the management packs in the previous list is provided with
System Center Operations Manager, and can be found in the %Program Files%\System Center
Operations Manager 2007\Health Service State\Management Packs folder.

Creating a New Distributed Application


Mapping a management model to a distributed application creates a single instance of the
distributed application. This instance contains all instances of the managed entities. However,
in some cases, a distributed application should contain only a subsection of the managed
application, with another distributed application containing the remaining subsections. To
achieve this separation, administrators and operators must create several identical distributed
applications that use application components from different classes.
It is possible to create several distributed applications with the same architecture and attach
different instances of classes (managed entities) to different distributed applications (for
example, separate environments for testing and production). To achieve this, you create a new
distributed application based on the distributed application template created by the TSMMD.
To create distributed applications using the management model template
1. Import the management pack generated by the TSMMD into System Center Operations
Manager as described in the earlier procedure "Importing a Management Pack".
2. In the System Center Operations Manager Operations Console, click the Authoring tab
in the left pane. If you see the Overview page describing tasks required, click the Go to
Distributed Applications link.
3. Click the Create a new distributed application icon on the toolbar or the Create a new
distributed application link in the Actions pane on the right of the main console
window.
4. This starts the Distributed Application Designer wizard. Enter a name and description
for the distributed application. Then, in the Template list, select the template that the
TSMMD generated within the management pack. The name of this template is
"Template for your-model-name".
5. Specify the location to store the management pack for the new distributed application.
This should be the management pack generated by the TSMMD, which contains the
template. If this management pack is sealed, select an existing management pack that is
not sealed.
6. Click OK. Then, when the wizard finishes, the Operations Manager Distributed
Application Designer window contains a diagram similar to the TSMMD model. This
diagram shows the components of the distributed application, and the left pane of the
window shows lists of class instances by component type.
7. Drag a class instance from the list onto the component in the diagram. You can only
attach one component type to any instance of an Executable Application component in
the designer. However, the Windows Service, ASP.NET Web Application, and ASP.NET
Web Service components have extensions. You can attach the base class and the
extension class to these types of component, as shown in Figure 2.

Figure 2
Specifying class instances for components in the Distributed Application Designer

8. The designer will create the common dependency and roll-up monitors for the
distributed application. However, you can delete some components if required; for
example, if you are creating separate environments for testing and production.
9. Click Save to create the new distributed application, or to save your changes if you are
editing an existing distributed application.
10. Unlike the original distributed application, you can modify this distributed application
afterward, if required, by using the Operations Manager management console.

For details of how to edit and use a management pack for an application, see Chapter 17
"Creating and Using System Center Operations Manager 2007 Management Packs".
Appendix B
Walkthrough of the Team System Management Model Designer Power Tool
This topic contains a simple hands-on demonstration of the Team System Management Model
Designer Power Tool (TSMMD) that will help you understand what it does and how you can use
it.
Note that this walkthrough describes the minimum set of steps required to build a management
model and health definition, generate instrumentation, and generate a System Center
Operations Manager 2007 management pack. It does not implement good programming
practices, but it will serve as a valuable starting point for understanding the DFO process and the
TSMMD.
The process divides into discrete sections, so that you can complete as many as you want.
However, you must complete the first section if you want to generate the instrumentation code
and an Operations Manager management pack. The following are the four sections:

• Building a Management Model
• Generating the Instrumentation Code
• Testing the Model with a Windows Forms Application
• Generating an Operations Manager 2007 Management Pack

Building a Management Model


The first task is to build a management model using the Team System Management Model
Designer Power Tool (TSMMD). You will create a new TSMMD solution, and then you will create
the graphical model and specify the instrumentation for it.
To create the new TSMMD solution
1. Start Visual Studio 2008 Team System Edition, click the File menu, point to New, and
then click Project.
2. In the New Project dialog box, click TSMMD Project in the list of project types, and then
click TSMMD Project in the list of projects. Enter a name and location for the new
project and click OK. This creates a new TSMMD project containing a new management
model named operations.tsmmd. The Management Model Explorer window appears
showing this new empty model, and the blank model designer surface appears in the
main window.

If you cannot see the Management Model Explorer window, click the View menu,
point to Other Windows, and click Management Model Explorer.

3. Ensure that the guidance packages for the TSMMD are loaded. To do this, click
Guidance Package Manager on the Visual Studio Tools menu. If the list of recipes in the
Guidance Package Manager dialog box does not contain any entries that apply to Team
System Management Model, follow these steps to enable the recipes:
◦ Click the Enable/Disable Packages button.
◦ Select the two guidance packages named Team System MMD Instrumentation
and Team System MMD Management Pack Generation.

◦ Click OK to return to the Guidance Package Manager dialog box.


◦ Click Close to close the Guidance Package Manager dialog box.

If you do not see the two guidance packages in the list, you may need to reinstall the
TSMMD guidance package.

4. In Management Model Explorer, select the top-level item named Operations. In the
Visual Studio Properties window, change the Name property to MyTestModel, and then
enter some text for the Description and Knowledgebase properties. If you cannot see
the Properties window, press F4.
5. In Management Model Explorer, expand the Target Environments node and select the
target environment named Default. Change the value of the Event Log property to True
to indicate that you require instrumentation that writes to the Windows Event Log.

You use the properties of a target environment to specify that you require any
combination of Enterprise Library Logging events, Windows Event Log events, trace file
events, Windows Eventing 6.0 events, Windows Management Instrumentation (WMI)
events, and Windows performance counters for that target environment. You can also
add more than one target environment to a model to describe different deployment
scenarios.

The next stage is to create the graphical representation of the application entities.
To create the new management model
1. In Management Model Explorer, right-click the top-level MyTestModel entry, then click
New Managed Entity Wizard. Enter the name CustomerApplication for this entity,
select Executable Application in the drop-down list, type a description for this entity in
the Description box, as shown in Figure 1, and then click Next.
Figure 1
First page of the Add New Managed Entity wizard

Alternatively, you can right-click the top-level MyTestModel entry and then click Add
New Executable Application or you can drag an Executable Application control from
the Toolbox onto the designer surface and then edit the properties in the Properties
window.

2. On the Specify Managed Entity properties page of the wizard, make sure FilePath is
selected in the Discovery Type box, then type %Program
Files%\CustomerApplication.exe in the Discovery Target box as shown in Figure 2.
Monitoring systems such as System Center Operations Manager use the settings on this
page (which are exposed in the management pack you generate) to check whether the
application is installed on a specific target computer. Click Finish to create the new
CustomerApplication managed entity, which appears on the designer surface. The
Properties window shows the settings and values you entered in the wizard.
Figure 2
Last page of the Add New Managed Entity wizard

Some managed entity types, such as ASP.NET Application and ASP.NET Web Service,
have extender properties that specify additional settings for the management pack
generated by the TSMMD.

3. Drag an External Managed Entity control from the Toolbox onto the designer surface.
In the Properties window, change the value of the Name property to
CustomerDatabase.

The wizard does not allow you to create external managed entities because the only
property they have is a name. External managed entities act as connectors or placeholders
for parts of the overall application or system that are outside the management scope.

4. In the Toolbox, click the Connection control, click the CustomerApplication entity, and
then click the CustomerDatabase entity. This creates the connection between the two
entities. You can edit or delete the Text property for the connection in the Visual Studio
Properties window.
5. In Management Model Explorer, expand the Managed Entities node to see the two
entities you added to the diagram. Notice that the External Managed Entity (named
CustomerDatabase) has no instrumentation or health sections. You do not create
instrumentation or health definitions for External Managed Entities. Figure 3 shows the
model at this stage.
6. On the Visual Studio File menu, click Save All.

Figure 3
The graphical representation of the application entities

The next stage is to populate the health definition and instrumentation sections of the
management model. The health model defines the health states for each entity as a series of
aspects and the indicators (the instrumentation) that cause a transition in these health states.
You can add events, measures, and aspects to the model individually and set their properties as
you develop and fine-tune your model. However, the TSMMD provides a wizard that helps you
create a new aspect and specify the associated instrumentation. When you build a complex
model, you will probably have to iterate through the process of using the wizard, and then
manually add and edit items in the graphical model as it evolves. However, the wizard makes it
easy to start adding instrumentation and health definitions to the model.
To add a health definition aspect and the associated instrumentation to the management
model
1. In Management Model Explorer, right-click the top-level MyTestModel entry, and then
click Validate All. The Visual Studio Error List window will show a warning indicating
that you must define at least one event or measure for the managed entity
(CustomerApplication).
This is a useful way to check that your model is valid as you work with it. You can also
validate individual sections of the model. For example, to check only the managed
instrumentation for this entity, right-click the Managed Instrumentation child node of
the CustomerApplication node in Management Model Explorer, and then click
Validate.

2. In Management Model Explorer, right-click the CustomerApplication node, and then
click New Aspect Wizard. Specify the following values in the first page of the wizard, as
shown in Figure 4, then click Next:
◦ Type NoDatabaseConnection in the Aspect name box
◦ Select Availability in the Aspect category drop-down list
◦ Type some explanatory text for the aspect in the Aspect knowledgebase box
◦ Click Event in the Aspect based on list
◦ Click Green-Red in the Aspect States list

Figure 4
First page of the Add New Aspect wizard
These settings specify that you want to implement a two-state health indicator for this
aspect, which will be driven by two events—one that indicates connection failed
(RED), and one that indicates connection available or restored (GREEN). If you want to
implement instrumentation that displays a warning, you select Green-Red-Yellow and
will therefore need to specify three events. Alternatively, you can base an aspect on a
performance counter by selecting Measure instead of Event.

3. On the next page of the wizard, you specify the events for the NoDatabaseConnection
aspect. Click the ellipsis button (...) next to the Red Health State Event text box to
open the Browse Events dialog box (shown in Figure 5). The dialog box is currently
empty because your model does not define any events.

Figure 5
The Browse Events dialog box where you select an existing event or create a new event

4. In the Browse Events dialog box, click New.
5. In the Create New Event dialog box, type NoDatabaseConnection in the Event Name
box, select Coarse in the Level drop-down list (if it is not already selected), and then
click OK, as shown in Figure 6. This adds the new event to the Browse Events list.
Figure 6
Create New Event dialog box

6. In the Browse Events dialog box, click NoDatabaseConnection in the Events list and
then click OK. This adds the NoDatabaseConnection event to the Red Health State
Event box of the Add New Aspect wizard.
7. Repeat the process for the Green Health State. To do this:
◦ Click the ellipsis button (...) next to the Green Health State Event text box
◦ Click New in the Browse Events dialog box
◦ Type DatabaseConnectionRestored in the Event Name box and select Coarse in
the Level drop-down list
◦ Click OK in the Create New Event dialog box
◦ In the Browse Events dialog box, click DatabaseConnectionRestored in the
Events list and then click OK.
8. This adds the DatabaseConnectionRestored event to the Green Health State Event box
of the Add New Aspect wizard. You now have events defined for both of the states of the
NoDatabaseConnection aspect, as shown in Figure 7.
Figure 7
The two new events specified for the NoDatabaseConnection aspect

9. Click Finish. The wizard creates the new aspect named NoDatabaseConnection and the
abstract event implementations NoDatabaseConnection and
DatabaseConnectionRestored. You can examine the new aspect and events in
Management Model Explorer, as shown in Figure 8.
Figure 8
The new aspect and events in Management Model Explorer

You can define parameters for events, which the instrumentation will populate and
expose to the Windows event system when that event is raised. In this example, the two
events will pass the name of the database as a parameter to the events system.
Therefore, the next step is to define these parameters.

10. In Management Model Explorer, right-click the NoDatabaseConnection node, and then
click Add New Event Parameter. In the Properties window for the new parameter,
make sure the Index property is set to 1, and the Type property is set to String. Change
the Name property to DatabaseName.
11. Repeat this process for the DatabaseConnectionRestored event by adding a new
parameter and changing the Name property to DatabaseName. Figure 9 shows the
result.
Figure 9
The events and their parameters shown in Management Model Explorer

The two events you have defined are abstract events. Now you must create the
concrete implementations of these events. Each abstract event and measure must
have an implementation for every target environment in the model. In this example,
you use just the default target environment; therefore, you require only one concrete
implementation of each event.

12. In Management Model Explorer, right-click the NoDatabaseConnection node, and then
click New Event implementation Wizard. The first page of the wizard shows any
discovered (existing) events in your application and the managed implementations you
must create for each target environment (see Figure 10). There are no discovered
events in this example, and the wizard shows only the single implementation technology
you specified when you created the model (Event Log), which is selected by default.
Figure 10
First page of the New Event Implementation Wizard

13. On this page of the wizard, click Next.


14. On the next page of the wizard, you specify the properties for the Event Log event, as
shown in Figure 11. For the example application:
◦ Type NoDatabaseConnectionEvent in the Name box
◦ Type Application in the Log Name box (if it is not already there)
◦ Select Error in the Severity list box if it is not already selected
◦ Leave the default setting in the Source box
◦ Type 9000 in the Event Id box
◦ Type Database name: %1 in the Message Template box
◦ Click Finish.
Figure 11
Last page of the New Event Implementation Wizard

The value Database name: %1 in the Message Template box is a string that will be
passed to the event system; it must contain a placeholder (%1) for the event
parameter you defined when you created the abstract event definition. In general, a
Message Template string must include a placeholder for each parameter you define
for an event. The placeholders must start at %1 and run consecutively up to the
number of parameters you define for that abstract event.
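For example, if you had defined a second parameter for an event (say, a hypothetical ServerName parameter with its Index property set to 2), the Message Template would need two consecutive placeholders:

```
Database name: %1 on server: %2
```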

The wizard creates the configurable implementation of the event. You can view this
event in the Management Model Explorer and see the property values you specified in
the Properties window. Figure 12 shows both of these windows at this stage.
Figure 12
The concrete implementation of the NoDatabaseConnection event

15. Repeat this process to create an Event Log Event implementation for the
DatabaseConnectionRestored event. To do this, execute the New Event
implementation Wizard, change the event name to
DatabaseConnectionRestoredEvent, change the event ID to 9001, set the severity to
Information, and enter the value Database name: %1 in the Message Template box.

If you have more than one target environment in the model, the wizard will display a
dialog box to collect information for the appropriate event or measure
implementation(s) for each target environment.

16. To confirm that you have created a valid model, right-click anywhere in Management
Model Explorer, and then click Validate All. You should see the following message in the
Visual Studio Output window:

---- Validation started: Model elements validated: 58 ----
==== Validation complete: 0 errors, 0 warnings, 0 information messages =====

17. On the Visual Studio File menu, click Save All.

You have now completed the simple health model definition and instrumentation definition for
the application, and you have validated the model. Figure 13 shows the model at this stage.
Figure 13
The management model showing the complete managed instrumentation definition

Of course, you will usually add more aspects to the model and specify the appropriate events
and measures (performance counters). Remember that the correct approach during application
and system design is to identify the health states and transitions first, which leads to the
definition of the instrumentation required to surface these transitions. This simple walkthrough
is designed to help you gain experience with the Team System Management Model Designer
Power Tool, so it assumes that you have previously identified the health states.

Generating the Instrumentation Code


With the management model now complete, you can use the Team System Management Model
Designer Power Tool to create the instrumentation code for your application. In this section of
the walkthrough, you will use a recipe within the TSMMD guidance package to create the helper
classes, instrumentation implementations, and the configuration information required for the
simple management model you created in the previous sections of the walkthrough.
To generate the instrumentation code for the application
1. In Management Model Explorer, right-click the top-level MyTestModel entry, and then
click Generate Instrumentation Helper. This first validates the model, and then it
generates the instrumentation projects, classes, and artifacts. When it is complete, you
will see the file InstrumentationConfiguration.config open in the main Visual Studio edit
pane. Close this before you continue with this walkthrough.
2. Open the Visual Studio Solution Explorer window. You will see a new folder named
Instrumentation that contains the instrumentation projects, classes, and artifacts.
3. Now you can use the TSMMD to verify the instrumentation coverage in the application.
In the Visual Studio Solution Explorer window, double-click the management model file
operations.tsmmd (in the project folder of your solution) to show the model designer
and Management Model Explorer.
4. In Management Model Explorer, right-click the top-level Management Model entry and
click Verify Instrumentation Coverage. You will see two errors in the Visual Studio Error
List window that indicate your application code does not invoke the two methods
defined in your generated instrumentation code. This is the expected result at this stage
because you have not yet built the application. You will do this in the next stage of the
walkthrough.
5. On the Visual Studio File menu, click Save All.

As you saw in this section of the walkthrough, the TSMMD can create the instrumentation
helper classes for an application and verify that your application actually does invoke all of the
instrumentation in the model. In other words, the application should raise every abstract event
and increment each abstract counter in at least one location in the code. Figure 1 shows the
instrumentation generated at this stage of the walkthrough.
Figure 1
The instrumentation projects, classes, and artifacts generated by the TSMMD

Testing the Model with a Windows Forms Application
With the instrumentation classes now available, you are ready to create a minimal application,
connect the instrumentation, and verify instrumentation coverage. Then you can configure the
application, compile it, and run it to ensure that the instrumentation code works correctly.
To create the test application, and verify the instrumentation coverage
1. On the Visual Studio File menu, point to Add, and then click New Project.
2. In the Add New Project dialog box, expand the Visual C# entry in the project types list,
and then click Windows.
3. In the list of templates, click Windows Forms Application. Change the name to
CustomerApplication and change the location to specify a new subfolder named
CustomerApplication in your solution folder, as shown in Figure 1. Then click OK.

Figure 1
Adding a new application project to the solution

4. In Solution Explorer, right-click the References node in the new CustomerApplication
project, and then click Add Reference.
5. On the Projects tab of the Add Reference dialog box, hold down the CTRL key and click
CustomerApplication.API and CustomerApplication.Default.Impl (so that both are
selected), then click OK.
6. In the designer for Form1.cs, drag two Button controls from the Common Controls
section of the Toolbox onto the form. Change the Text property of the first button to
Database Connected, and change the Text property of the second button to
Connection Lost. Resize the buttons and the form so that you can see the captions.
7. Double-click the Database Connected button to open the code editor with the insertion
point in the button1_Click method. Add the following line of code to the method.
C#
CustomerApplication.API.CustomerApplicationAPI.GetInstance().RaiseDatabaseConnectionRestored("CustomerDatabase");

Visual Studio's IntelliSense feature will help you to enter the code quickly and easily.

8. Double-click the Connection Lost button to open the code editor with the insertion
point in the button2_Click method. Add the following line of code to the method.
C#
CustomerApplication.API.CustomerApplicationAPI.GetInstance().RaiseNoDatabaseConnection("CustomerDatabase");

Notice that these events expect you to specify a parameter—the DatabaseName
parameter that you defined in the model. For details of the methods exposed by the
instrumentation helper classes, see Using the Generated Instrumentation Code in
Applications.

9. On the Visual Studio File menu, click Save All, and then close the code editor and Form1
designer windows.
10. In Management Model Explorer, right-click the top-level MyTestModel entry, and then
click Verify Instrumentation Coverage. You should see that the Visual Studio Error List
window now contains no errors or warnings because your code now invokes all the
abstract events defined in the model.
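The button-click code above calls into the generated singleton facade. As a rough conceptual sketch only (not the actual TSMMD-generated code, whose class and member names come from your model), the generated API follows this shape: a GetInstance() accessor plus one RaiseXxx method per abstract event, with model parameters such as DatabaseName surfacing as method arguments. Here the dispatch to concrete instrumentation is replaced by simple message formatting so that the sketch stays self-contained.

```csharp
using System;

// Conceptual sketch of the generated instrumentation facade.
// The real TSMMD output dispatches each abstract event to the concrete
// implementation (for example, an Event Log event) selected in the
// configuration file; this sketch just formats a message instead.
public sealed class CustomerApplicationApiSketch
{
    private static readonly CustomerApplicationApiSketch instance =
        new CustomerApplicationApiSketch();

    private CustomerApplicationApiSketch() { }

    // The walkthrough code obtains the singleton through GetInstance().
    public static CustomerApplicationApiSketch GetInstance()
    {
        return instance;
    }

    // One method per abstract event; the DatabaseName parameter defined
    // in the model becomes a method argument.
    public string RaiseNoDatabaseConnection(string databaseName)
    {
        return "NoDatabaseConnection: " + databaseName;
    }

    public string RaiseDatabaseConnectionRestored(string databaseName)
    {
        return "DatabaseConnectionRestored: " + databaseName;
    }
}
```

Calling GetInstance().RaiseNoDatabaseConnection("CustomerDatabase") on this sketch mirrors the button-click code you added in steps 7 and 8.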

Figure 2 shows the completed test application in the Visual Studio designer.
Figure 2
The completed test application

You are now ready to run the application, but first you must configure it. The instrumentation
generation routines in the TSMMD create a configuration file that allows administrators to
specify the target environment and the granularity of the instrumentation. You must copy the
contents of this file into your application configuration file (App.config or Web.config) and edit
the contents before you run the application.
To configure and run the test application
1. In Solution Explorer, right-click the CustomerApplication project entry, point to Add,
and then click New Item.
2. In the Add New Item dialog box, select Application Configuration File, and then click
Add.
3. In Solution Explorer, double-click the InstrumentationConfiguration.config file (located
in the Instrumentation folder of the main solution) to open it into the editor. Select the
entire contents of the <configuration> element, excluding the opening and closing
<configuration> tags, and copy it into the App.config file between the opening and
closing <configuration> tags.

By default, the configuration settings you just added to the App.config file specify the
Default target environment for the CustomerApplication entity, with the
instrumentation level set to Coarse. These are the values you need. If you created
other target environments or specified different levels for instrumentation you
created, you would edit the values of the targetEnvironment and
instrumentationLevel attributes of the <add> element for each of the managed
entities in your model.

4. In Solution Explorer, right-click the CustomerApplication project entry, and then click
Set as Startup Project.
5. Press F5 to run the test application. Click the Connection Lost button to raise the
NoDatabaseConnection event, click the Database Connected button to raise the
DatabaseConnectionRestored event, and then close the test application.
6. In Control Panel, open Windows Event Viewer from the Settings item or Administrative
Tools item and view the contents of the Application log. You will see the two events
(with the Source set to MyTestModel_CustomerApplication) raised by the
instrumentation in the test application.

Notice that, although you raised the abstract events in your test application
(NoDatabaseConnection and DatabaseConnectionRestored), the settings in the application
configuration file specify that the instrumentation helpers should raise the concrete
implementations of these events that you mapped to the Default target environment (the Event
Log Events named NoDatabaseConnectionEvent and DatabaseConnectionRestoredEvent).

Generating an Operations Manager 2007 Management Pack
If you intend to monitor your application using Microsoft System Center Operations Manager
2007, you can use your management model to generate a management pack that you can install
into Operations Manager. The management pack will contain a template for a distributed
application; base classes and components for each managed entity in your model; and the
monitors and rules required to monitor the application.
To generate a management pack for System Center Operations Manager 2007
1. In Management Model Explorer, right-click the top-level MyTestModel entry, and then
click Generate Management Pack for OpsMgr 2007.
2. By default, the TSMMD generates a management pack named
Application.Operations.xml in a folder named ManagementPack within the same folder
as your project. Open this folder in Windows Explorer, and open the new management
pack in a text editor to view the contents or import it into System Center Operations
Manager to see the artifacts it contains.

To import a TSMMD-generated management pack into System Center Operations Manager,
you must also import the standard management packs upon which the generated
management pack depends.
The TSMMD can generate management packs automatically when you build an application, or
you can generate management packs from the command line. For more information, see the
documentation installed with the TSMMD.
Appendix C
Performance Counter Types
This appendix lists the performance counter types available in the .NET Framework 2.0. It is
adapted from the MSDN documentation.

Counter type Description

AverageBase A base counter that is used in the calculation of time or count
averages, such as AverageTimer32 and AverageCount64. Stores the
denominator for calculating a counter to present "time per operation"
or "count per operation".

AverageCount64 An average counter that shows how many items are processed, on
average, during an operation. Counters of this type display a ratio of
the items processed to the number of operations completed. The ratio
is calculated by comparing the number of items processed during the
last interval to the number of operations completed during the last
interval.
Formula: (N1 - N0) / (B1 - B0), where N1 and N0 are performance
counter readings, and B1 and B0 are their corresponding
AverageBase values. Thus, the numerator represents the number of
items processed during the sample interval, and the denominator
represents the number of operations completed during the sample
interval.
Counters of this type include PhysicalDisk\ Avg. Disk Bytes/Transfer.

AverageTimer32 An average counter that measures the time it takes, on average, to
complete a process or operation. Counters of this type display a ratio
of the total elapsed time of the sample interval to the number of
processes or operations completed during that time. This counter type
measures time in ticks of the system clock.
Formula: ((N1 - N0) / F) / (B1 - B0), where N1 and N0 are performance
counter readings, B1 and B0 are their corresponding AverageBase
values, and F is the number of ticks per second. The value of F is
factored into the equation so that the result can be displayed in
seconds. Thus, the numerator represents the number of ticks
counted during the last sample interval, F represents the frequency of
the ticks, and the denominator represents the number of operations
completed during the last sample interval.
Counters of this type include PhysicalDisk\ Avg. Disk sec/Transfer.

CounterDelta32 A difference counter that shows the change in the measured attribute
between the two most recent sample intervals.
Formula: N1 - N0, where N1 and N0 are performance counter
readings.

CounterDelta64 A difference counter that shows the change in the measured attribute
between the two most recent sample intervals. It is the same as the
CounterDelta32 counter type except that it uses larger fields to
accommodate larger values.
Formula: N1 - N0, where N1 and N0 are performance counter
readings.

CounterMultiBase A base counter that indicates the number of items sampled. It is used
as the denominator in the calculations to get an average among the
items sampled when taking timings of multiple, but similar items. Used
with CounterMultiTimer, CounterMultiTimerInverse,
CounterMultiTimer100Ns, and CounterMultiTimer100NsInverse.

CounterMultiTimer A percentage counter that displays the active time of one or more
components as a percentage of the total time of the sample interval.
Because the numerator records the active time of components
operating simultaneously, the resulting percentage can exceed 100
percent.
This counter is a multitimer. Multitimers collect data from more than
one instance of a component, such as a processor or disk. This
counter type differs from CounterMultiTimer100Ns in that it measures
time in units of ticks of the system performance timer, rather than in
100 nanosecond units.
Formula: ((N1 - N0) / (D1 - D0)) x 100 / B, where N1 and N0 are
performance counter readings, D1 and D0 are their corresponding
time readings in ticks of the system performance timer, and the
variable B denotes the base count for the monitored components
(using a base counter of type CounterMultiBase). Thus, the numerator
represents the portions of the sample interval during which the
monitored components were active, and the denominator represents
the total elapsed time of the sample interval.

CounterMultiTimer100Ns A percentage counter that shows the active time of one or more
components as a percentage of the total time of the sample interval. It
measures time in 100 nanosecond (ns) units.
This counter type is a multitimer. Multitimers are designed to monitor
more than one instance of a component, such as a processor or disk.
Formula: ((N1 - N0) / (D1 - D0)) x 100 / B, where N1 and N0 are
performance counter readings, D1 and D0 are their corresponding
time readings in 100-nanosecond units, and the variable B denotes the
base count for the monitored components (using a base counter of
type CounterMultiBase). Thus, the numerator represents the portions
of the sample interval during which the monitored components were
active, and the denominator represents the total elapsed time of the
sample interval.

CounterMultiTimer100NsInverse A percentage counter that shows the active time of one or more
components as a percentage of the total time of the sample interval.
Counters of this type measure time in 100 nanosecond (ns) units.
They derive the active time by measuring the time that the
components were not active and subtracting the result from 100
percent multiplied by the number of objects monitored.
This counter type is an inverse multitimer. Multitimers are designed to
monitor more than one instance of a component, such as a processor
or disk. Inverse counters measure the time that a component is not
active and derive its active time from the measurement of inactive time.
Formula: (B - ((N1 - N0) / (D1 - D0))) x 100, where the denominator
represents the total elapsed time of the sample interval, the numerator
represents the time during the interval when monitored components
were inactive, and B represents the number of components being
monitored, using a base counter of type CounterMultiBase.

CounterMultiTimerInverse A percentage counter that shows the active time of one or more
components as a percentage of the total time of the sample interval. It
derives the active time by measuring the time that the components
were not active and subtracting the result from 100 percent multiplied
by the number of objects monitored.
This counter type is an inverse multitimer. Multitimers monitor more
than one instance of a component, such as a processor or disk.
Inverse counters measure the time that a component is not active and
derive its active time from that measurement.
This counter differs from CounterMultiTimer100NsInverse in that it
measures time in units of ticks of the system performance timer, rather
than in 100 nanosecond units.
Formula: (B - ((N1 - N0) / (D1 - D0))) x 100, where the denominator
represents the total elapsed time of the sample interval, the numerator
represents the time during the interval when monitored components
were inactive, and B represents the number of components being
monitored, using a base counter of type CounterMultiBase.

CounterTimer A percentage counter that shows the average time that a component
is active as a percentage of the total sample time.
Formula: (N1 - N0) / (D1 - D0), where N1 and N0 are performance
counter readings, and D1 and D0 are their corresponding time
readings. Thus, the numerator represents the portions of the sample
interval during which the monitored components were active, and the
denominator represents the total elapsed time of the sample interval.

CounterTimerInverse A percentage counter that displays the average percentage of active
time observed during the sample interval. The value of these counters
is calculated by monitoring the percentage of time that the service was
inactive and then subtracting that value from 100 percent.
This is an inverse counter type. Inverse counters measure the time
that a component is not active and derive the active time from that
measurement. This counter type is the same as
CounterTimer100NsInv except that it measures time in units of ticks of
the system performance timer rather than in 100 nanosecond units.
Formula: (1 - ((N1 - N0) / (D1 - D0))) x 100, where the numerator
represents the time during the interval when the monitored
components were inactive, and the denominator represents the total
elapsed time of the sample interval.

CountPerTimeInterval32 An average counter designed to monitor the average length of a
queue to a resource over time. It shows the difference between the
queue lengths observed during the last two sample intervals divided
by the duration of the interval. This type of counter is typically used to
track the number of items that are queued or waiting.
Formula: (N1 - N0) / (D1 - D0), where the numerator represents the
number of items in the queue and the denominator represents the time
elapsed during the last sample interval.

CountPerTimeInterval64 An average counter that monitors the average length of a queue to a
resource over time. Counters of this type display the difference
between the queue lengths observed during the last two sample
intervals, divided by the duration of the interval. This counter type is
the same as CountPerTimeInterval32 except that it uses larger fields
to accommodate larger values. This type of counter is typically used to
track a high-volume or very large number of items that are queued or
waiting.
Formula: (N1 - N0) / (D1 - D0), where the numerator represents the
number of items in a queue and the denominator represents the time
elapsed during the sample interval.

ElapsedTime A difference timer that shows the total time between when the
component or process started and the time when this value is
calculated.
Formula: (D0 - N0) / F, where D0 represents the current time, N0
represents the time the object was started, and F represents the
number of time units that elapse in one second. The value of F is
factored into the equation so that the result can be displayed in
seconds.
Counters of this type include System\ System Up Time.

NumberOfItems32 An instantaneous counter that shows the most recently observed
value. Used, for example, to maintain a simple count of items or
operations.
Formula: None. Does not display an average, but shows the raw data
as it is collected.
Counters of this type include Memory\Available Bytes.

NumberOfItems64 An instantaneous counter that shows the most recently observed
value. Used, for example, to maintain a simple count of a very large
number of items or operations. It is the same as NumberOfItems32
except that it uses larger fields to accommodate larger values.
Formula: None. Does not display an average, but shows the raw data
as it is collected.

NumberOfItemsHEX32 An instantaneous counter that shows the most recently observed
value in hexadecimal format. Used, for example, to maintain a simple
count of items or operations.
Formula: None. Does not display an average, but shows the raw data
as it is collected.

NumberOfItemsHEX64 An instantaneous counter that shows the most recently observed
value. Used, for example, to maintain a simple count of a very large
number of items or operations. It is the same as
NumberOfItemsHEX32 except that it uses larger fields to
accommodate larger values.
Formula: None. Does not display an average, but shows the raw data
as it is collected.

RateOfCountsPerSecond32 A difference counter that shows the average number of operations
completed during each second of the sample interval. Counters of this
type measure time in ticks of the system clock.
Formula: (N1 - N0) / ((D1 - D0) / F), where N1 and N0 are
performance counter readings, D1 and D0 are their corresponding
time readings, and F represents the number of ticks per second. Thus,
the numerator represents the number of operations performed during
the last sample interval, the denominator represents the number of
ticks elapsed during the last sample interval, and F is the frequency of
the ticks. The value of F is factored into the equation so that the result
can be displayed in seconds.
Counters of this type include System\ File Read Operations/sec.

RateOfCountsPerSecond64 A difference counter that shows the average number of operations
completed during each second of the sample interval. Counters of this
type measure time in ticks of the system clock. This counter type is the
same as the RateOfCountsPerSecond32 type, but it uses larger fields
to accommodate larger values to track a high-volume number of items
or operations per second, such as a byte-transmission rate.
Formula: (N1 - N0) / ((D1 - D0) / F), where N1 and N0 are
performance counter readings, D1 and D0 are their corresponding
time readings, and F represents the number of ticks per second. Thus,
the numerator represents the number of operations performed during
the last sample interval, the denominator represents the number of
ticks elapsed during the last sample interval, and F is the frequency of
the ticks. The value of F is factored into the equation so that the result
can be displayed in seconds.
Counters of this type include System\ File Read Bytes/sec.

RawBase A base counter that stores the denominator of a counter that presents
a general arithmetic fraction. Check that this value is greater than zero
before using it as the denominator in a RawFraction value calculation.

RawFraction An instantaneous percentage counter that shows the ratio of a subset
to its set as a percentage. For example, it compares the number of
bytes in use on a disk to the total number of bytes on the disk.
Counters of this type display the current percentage only, not an
average over time.
Formula: (N0 / D0) x 100, where D0 represents a measured attribute
(using a base counter of type RawBase) and N0 represents one
component of that attribute.
Counters of this type include Paging File\% Usage Peak.

SampleBase A base counter that stores the number of sampling interrupts taken
and is used as a denominator in the sampling fraction. The sampling
fraction is the number of samples that were 1 (or true) for a sample
interrupt. Check that this value is greater than zero before using it as
the denominator in a calculation of SampleCounter or
SampleFraction.

SampleCounter An average counter that shows the average number of operations
completed in one second. When a counter of this type samples the
data, each sampling interrupt returns one or zero. The counter data is
the number of ones that were sampled. It measures time in units of
ticks of the system performance timer.
Formula: (N1 - N0) / ((D1 - D0) / F), where the numerator (N)
represents the number of operations completed, the denominator (D)
represents elapsed time in units of ticks of the system performance
timer, and F represents the number of ticks that elapse in one second.
F is factored into the equation so that the result can be displayed in
seconds.

SampleFraction A percentage counter that shows the average ratio of hits to all
operations during the last two sample intervals.
Formula: ((N1 - N0) / (D1 - D0)) x 100, where the numerator
represents the number of successful operations during the last sample
interval, and the denominator represents the change in the number of
all operations (of the type measured) completed during the sample
interval, using counters of type SampleBase.
Counters of this type include Cache\Pin Read Hits %.

Timer100Ns A percentage counter that shows the active time of a component as a
percentage of the total elapsed time of the sample interval. It
measures time in units of 100 nanoseconds (ns). Counters of this type
are designed to measure the activity of one component at a time.
Formula: (N1 - N0) / (D1 - D0) x 100, where the numerator represents
the portions of the sample interval during which the monitored
components were active, and the denominator represents the total
elapsed time of the sample interval.
Counters of this type include Processor\ % User Time.

Timer100NsInverse A percentage counter that shows the average percentage of active
time observed during the sample interval.
This is an inverse counter. Counters of this type calculate active time
by measuring the time that the service was inactive and then
subtracting the percentage of active time from 100 percent.
Formula: (1 - ((N1 - N0) / (D1 - D0))) x 100, where the numerator
represents the time during the interval when the monitored
components were inactive, and the denominator represents the total
elapsed time of the sample interval.
Counters of this type include Processor\ % Processor Time.
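As a sanity check on the notation used above, three representative formulas from the table can be evaluated directly in C#. The sample values below are invented purely for illustration; the parameter names follow the table's N0/N1, D0/D1, B0/B1, and F symbols.

```csharp
using System;

// Evaluates three of the counter formulas from the table with
// invented sample values.
public static class CounterFormulaExamples
{
    // AverageCount64: (N1 - N0) / (B1 - B0)
    public static double AverageCount64(long n1, long n0, long b1, long b0)
    {
        return (double)(n1 - n0) / (b1 - b0);
    }

    // RateOfCountsPerSecond32: (N1 - N0) / ((D1 - D0) / F)
    public static double RateOfCountsPerSecond(
        long n1, long n0, long d1, long d0, double ticksPerSecond)
    {
        return (n1 - n0) / ((d1 - d0) / ticksPerSecond);
    }

    // RawFraction: (N0 / D0) x 100
    public static double RawFraction(long n0, long d0)
    {
        return (double)n0 / d0 * 100.0;
    }

    public static void Main()
    {
        // 5,000 items processed over 100 operations: 50 items per operation.
        Console.WriteLine(AverageCount64(6000, 1000, 150, 50));
        // 1,000 operations over one second's worth of ticks: 1,000 ops/sec.
        Console.WriteLine(
            RateOfCountsPerSecond(1200, 200, 20000000, 10000000, 10000000.0));
        // 25 of 100 bytes in use: 25 percent.
        Console.WriteLine(RawFraction(25, 100));
    }
}
```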
