One thing I'm asked on a regular basis when working with SQL Server consulting clients that don't have much experience with performance tuning is how to find their most expensive queries. Only, the problem is that the answer to that question can be a bit more complex than you might think. Happily, though, I've got a pretty decent answer to that question. But first, some background. See also, "Estimating Query Costs" and "Understanding Query Plans."

SQL Server Uses a Cost-Based Optimizer

While I don't have time to cover it in any detail here, one of the things that makes SQL Server special is that it boasts a cost-based optimizer. Meaning, if you feed it a complex query, it'll iterate over a number of different potential ways to execute your query against the storage engine in order to find a plan with the least expensive cost. And, to figure out cost, SQL Server ends up having to know a decent amount about your data, such as how many rows are in each table queried, the distribution of unique values within the columns being filtered against, and how likely it is that hits will be found for the joins or filters being specified. (Or, in other words, if you fire off a SELECT * FROM dbo.SomeTable WHERE blah = 'yadayada'; statement, SQL Server has (or will have) a pretty good idea of not only how many rows are in dbo.SomeTable, but it'll also have a rough idea of how many of them have a blah column with a value equal to 'yadayada'. For more information, I highly recommend taking a peek at the fantastic MCM presentation on statistics from Kimberly L. Tripp of SQLSkills.)
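As a quick, hedged illustration of the statistics involved (dbo.SomeTable is the example table from above, but the statistics name below is purely hypothetical; sp_helpstats 'dbo.SomeTable' will list the real ones), you can peek at the histogram SQL Server keeps with DBCC SHOW_STATISTICS:

```sql
-- Hypothetical statistics/index name; substitute one of your own.
-- Returns the header, density vector, and histogram the optimizer
-- uses to estimate how many rows match a predicate like blah = 'yadayada'.
DBCC SHOW_STATISTICS ('dbo.SomeTable', 'IX_SomeTable_blah');

-- Or ask for just the histogram:
DBCC SHOW_STATISTICS ('dbo.SomeTable', 'IX_SomeTable_blah') WITH HISTOGRAM;
```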
Long story short, though: as powerful and amazing as SQL Server's cost-based optimizer is (and, make no mistake, it's part of SQL Server's secret recipe), one of the great things about SQL Server is that we can actually view the costs associated with particular operations. Simply highlight the query in question within SQL Server Management Studio and press CTRL+L to have SQL Server either generate or fetch (from the cache) and then Display [the] Estimated Execution Plan for any query or operation, or execute the query with the Include Actual Execution Plan option toggled as shown in the following screen capture:
A Very Rough Overview of Costs

Then, once you're able to view an execution plan, one of the great things about it is that you're able to see the cost of not only the entire execution plan, but of each individual operation that makes up the plan, simply by mousing over it as shown below:
And, again, a key thing to call out here is that these costs (estimated or otherwise) are based on SQL Server's knowledge of the size of your tables as well as the cardinality and distribution of your data. Or, in other words, these costs are based upon statistics about your data. They're not, therefore, something tangible like the number of milliseconds associated with an operation. As such, the best way to think of them is that lower numbers are better, unless you want to try and get into some of the nitty-gritty details about how these costs are calculated (which, again, is proprietary information, or part of SQL Server's secret sauce). With that said, there's still a way to frame these costs to provide an idea of what they roughly mean in the real world.

.003. Costs of .003 are about as optimized as you're going to get when interacting with the storage engine (executing some functions or operations can/will come in at cheaper costs, but I'm talking here about full-blown data-retrieval operations).

.03. Obviously, costs of .03 are a full order of magnitude greater than something with a cost of .003, but even these queries are typically going to be VERY efficient and quick, executing in less than a second in the vast majority of cases.

1. Queries with a cost of 1 aren't exactly ugly or painful (necessarily) and will typically take a second or less to execute. They're not burning up lots of resources, but they're also typically not as optimized as they could be (or they are optimized, but they're pulling back huge amounts of data or filtering against very large tables).

5. Queries with a cost greater than 5, by default, will be executed with a parallel plan, meaning that SQL Server sees these queries as being large enough to throw multiple processors/cores/threads-of-execution at in order to speed up execution.
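As a side note, that cut-over point comes from the server-level "cost threshold for parallelism" setting, which defaults to 5. A minimal sketch of viewing (and, if you choose, raising) it with sp_configure; the value 50 here is purely illustrative, not a recommendation:

```sql
-- 'cost threshold for parallelism' is an advanced option,
-- so advanced options must be visible first:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- View the current value:
EXEC sp_configure 'cost threshold for parallelism';

-- Raise it (illustrative value only; test before changing production):
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
```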
And if you've got a web site that's firing off a query with a cost of 5 or more on every page load, for example, you'll probably notice that the page feels a bit sluggish loading, maybe by a second or two, as compared to a page that would spring up if it was running a query with a cost of, say, .2 or lower. So, in other words, queries up in this range start having a noticeable or appreciable cost.

20. Queries in this range are TYPICALLY going to be something you can notice taking a second or so. (Though, on decent hardware, they can still end up being instantaneous as well, so even at this point, things still depend on a lot of factors.)

200. Queries with this kind of cost should really only be for larger reports and infrequently executed operations. Or, they might be serious candidates for additional tuning and tweaking (in terms of code and/or indexes).

1000. Queries up in this range are what DBAs start to lovingly call "queries from hell," though it's possible to bump into queries with costs in the tens of thousands or even more, depending upon the operations being executed and the amount of data being pored over.

And, in case it's not obvious from some of the descriptions above, the thresholds I've outlined REALLY need to be taken with a grain of salt, meaning that they're just rough approximations to try and give these numbers a bit of context (for those that aren't very experienced with performance tuning in general).

The True Cost of Operations

However, while taking a single query and comparing its cost in isolation is a great way to tune that operation to get better performance out of it (i.e., by adding/tuning indexes and/or tuning the code to make it better; your GOAL is to decrease costs, since LOWER costs are BETTER costs), it isn't a viable way to know what your most expensive query on a given server or within a given database is. For example, which query is truly uglier from a performance standpoint?
That big/horrible/ugly report that management likes to run once a day at 7PM with a total cost of 1820.5? Or a single operation with a cost of .23 that gets called over 800,000 times in the same day? Typically, a query with a cost of .23 won't really be scrutinized that much because it's optimized enough. But if it's called at highly repetitive rates, then that cost is incurred over and over and over again, typically during periods of peak load on your server. And, in fact, if you multiply .23 * 800K, you end up with a total, aggregate cost of 184,000, something that makes that nightly query look like child's play. As such, finding your most expensive queries is really a question of juggling execution costs against execution counts, because it's only when you consider both concerns that you start getting a sense for the TRUE costs of your most expensive operations.

Querying SQL Server For Your Most Expensive Queries

Accordingly, a while back I wrote a query that takes advantage of a couple of things to be able to go in and actually ask SQL Server for a list of Top Worst Performing queries. To do this, I took advantage of the fact that SQL Server's query optimizer KEEPS execution plans in the cache once it generates a good plan for a query or operation that's been sent in to the server. I also took advantage of the fact that SQL Server keeps track of how many TIMES that execution plan gets used (or re-used) by subsequent executions as a means of determining execution counts. Then, I also took advantage of the fact that SQL Server exposes these execution plans to DBAs as XML documents that can then be parsed and reviewed, to the point where you can actually use XPath to interrogate an execution plan and extract the actual cost of a given operation.
And with that, I was able to come up with a query that will scour SQL Server's plan cache, grab execution costs from each plan in the cache, and then multiply that number against the total number of times that the plan has been executed to generate a value that I call a Gross Cost, or the total cost of each operation being fired over and over again on the server. With that information, it's then possible to easily rank operations by their true cost (as in execution cost * number of executions) to find some of the worst queries on your server. The code itself isn't too hard to follow, and it's patterned in many ways on some similar-ish queries that Jonathan Kehayias has made available, where he too uses XPath/XQuery to zip in and aggressively query full-blown execution plans within the plan cache:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
core AS (
    SELECT
        eqp.query_plan AS [QueryPlan],
        ecp.plan_handle AS [PlanHandle],
        q.[text] AS [Statement],
        n.value('(@StatementOptmLevel)[1]', 'VARCHAR(25)') AS [OptimizationLevel],
        ISNULL(CAST(n.value('(@StatementSubTreeCost)[1]', 'VARCHAR(128)') AS float), 0) AS [SubTreeCost],
        ecp.usecounts AS [UseCounts],
        ecp.size_in_bytes AS [SizeInBytes]
    FROM sys.dm_exec_cached_plans AS ecp
    CROSS APPLY sys.dm_exec_query_plan(ecp.plan_handle) AS eqp
    CROSS APPLY sys.dm_exec_sql_text(ecp.plan_handle) AS q
    CROSS APPLY eqp.query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS qn(n)
)
SELECT TOP 100
    QueryPlan,
    PlanHandle,
    [Statement],
    OptimizationLevel,
    SubTreeCost,
    UseCounts,
    SubTreeCost * UseCounts AS [GrossCost],
    SizeInBytes
FROM core
ORDER BY GrossCost DESC; -- or: ORDER BY SubTreeCost DESC

Limitations of this Approach

Of course, there ARE some limitations with the query I've pasted above. First, it's an AGGRESSIVE query, typically weighing in with one of the WORST costs on many servers (i.e., it commonly shows up as a top-10 offender on servers that haven't been running for very long or which don't have lots and lots of really huge performance problems). And that, in turn, is because while SQL Server can do XML operations, they typically end up being VERY expensive. In this case, the query performs expensive XQuery iterations over each and every plan in the cache, meaning that it typically takes a LONG time to run. However, despite how LONG this query will typically take to run (remember that it has to grab the raw cost of every plan in the cache before it can calculate a gross cost based on total number of executions, meaning that there's no way to filter out any particular plans), it won't block or cause problems while it runs. Another limitation of this approach is that it can only calculate gross costs against accurate execution counts, meaning that if you have expensive queries (or lots of tiny little queries called over and over and over again) that get kicked out of the cache, then the execution counts aren't going to be as high as they really/truly are, and you'll therefore see lower gross costs and may, therefore, miss some of your worst-performing queries.
But otherwise, the query listed above provides a fantastic way to quickly and easily query a SQL Server for a list of your worst-performing queries. Then, once you have them, you're free to analyze the execution plans in question (by simply clicking on the QueryPlan column in the results pane, which should kick you out to a Graphical ShowPlan; if it doesn't, I'd recommend this post by Aaron Bertrand for a fix for a stupid bug that Microsoft refuses to address), and then, armed with information about how FREQUENTLY a particular operation is being called, you can spend whatever energy and effort is necessary to tune that operation as needed in order to drive its total, overall (or gross) cost down.
SQL SERVER - Find Most Expensive Queries Using DMV

May 14, 2010 by pinaldave

The title of this post is about all I can express here for this quick blog post. I was asked in a recent query-tuning consultation project if I could share the script which I use to figure out which are the most expensive queries running on SQL Server. This script is very basic and very simple; there are many different versions available online. This basic script does the job I expect it to do: find the most expensive queries on a SQL Server box.

SELECT TOP 10
    SUBSTRING(qt.[text], (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.[text])
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1),
    qs.execution_count,
    qs.total_logical_reads,
    qs.last_logical_reads,
    qs.total_logical_writes,
    qs.last_logical_writes,
    qs.total_worker_time,
    qs.last_worker_time,
    qs.total_elapsed_time / 1000000 AS total_elapsed_time_in_S,
    qs.last_elapsed_time / 1000000 AS last_elapsed_time_in_S,
    qs.last_execution_time,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER BY qs.total_logical_reads DESC -- logical reads
-- ORDER BY qs.total_logical_writes DESC -- logical writes
-- ORDER BY qs.total_worker_time DESC -- CPU time

You can change the ORDER BY clause to order this result set by different parameters. I invite my readers to share their scripts.
Script to Identify Worst Performing Queries

Posted by Mahesh Gupta on September 3, 2011

Being a DBA, to optimize performance you need to know:

What are the most frequently executed queries on your system?
What are the queries/statements which make the system really busy?
What are the top worst-performing queries?
How much IO is being caused by a particular query?
What is the CPU processing time to execute a particular query?
What is the frequency of execution of these worst-performing queries?

We can find most of this information in the DMV sys.dm_exec_query_stats, where we can rate SQL statements by their costs. These costs can be:
- AvgCPUTimeMiS = average CPU execution time
- AvgLogicalIo = average logical operations
or the total values of these measures.

/*-------------------------------------------------------------------------------
-- Description  : List expensive queries
-- Copyright 2011 - DBATAG
-- Author       : DBATAG
-- Created on   : 09/01/2011
-- Modified on  : 09/01/2011
-- Version      : 1.0
-- Dependencies : No table or procedure dependencies;
--                VIEW SERVER STATE permission required
-------------------------------------------------------------------------------*/

-- List expensive queries
DECLARE @MinExecutions int;
SET @MinExecutions = 5;
SELECT
    EQS.total_worker_time AS TotalWorkerTime,
    EQS.total_logical_reads + EQS.total_logical_writes AS TotalLogicalIO,
    EQS.execution_count AS ExeCnt,
    EQS.last_execution_time AS LastUsage,
    EQS.total_worker_time / EQS.execution_count AS AvgCPUTimeMiS,
    (EQS.total_logical_reads + EQS.total_logical_writes) / EQS.execution_count AS AvgLogicalIO,
    DB.name AS DatabaseName,
    SUBSTRING(EST.[text],
        1 + EQS.statement_start_offset / 2,
        (CASE
            WHEN EQS.statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(max), EST.[text])) * 2
            ELSE EQS.statement_end_offset
         END - EQS.statement_start_offset) / 2
    ) AS SqlStatement
    -- Optional: include the query plan; uncomment to show, but the query then takes much longer.
    --,EQP.query_plan AS [QueryPlan]
FROM sys.dm_exec_query_stats AS EQS
CROSS APPLY sys.dm_exec_sql_text(EQS.sql_handle) AS EST
CROSS APPLY sys.dm_exec_query_plan(EQS.plan_handle) AS EQP
LEFT JOIN sys.databases AS DB ON EST.dbid = DB.database_id
WHERE EQS.execution_count > @MinExecutions
  AND EQS.last_execution_time > DATEADD(MONTH, -1, GETDATE())
ORDER BY AvgLogicalIO DESC,
         AvgCPUTimeMiS DESC;
Finding Most Expensive Queries in SQL Server

SQL Server provides a number of DMVs which can be used to find the resources consumed by different queries. This is a very useful feature, especially when you would like to find the queries which need to be tuned, and it is quite handy for DBAs who are proactive in finding performance-related issues.
I will use the sys.dm_exec_query_stats, sys.dm_exec_query_plan, and sys.dm_exec_sql_text DMVs to find the queries which are performing badly on your systems.
Usage of the query:
Based on this, you can find the top 5, 10, or 20 resource-consuming queries, such as the top 20 queries by logical reads, CPU time, or physical reads.
1. If your system is CPU-starved, try to use the ranking based on CPU time.
2. If you are more concerned with elapsed time, try to use the ranking based on elapsed time.
3. If your system has IO-related issues, try to use the ranking based on physical and logical reads.
Now, how do you decide which one to use: total or average? It depends on various factors, but most of the time you should use the total values, as these will give you a clearer picture of which queries are either performing badly or are being executed a huge number of times. In both cases, this metric is better than the average one.
However, there might be queries which use WITH RECOMPILE, and these queries won't have cumulative data; thus you would not get a very clear picture of the queries eating up your resources if you use the total metric alone. These might still show up in the average metrics, so you should have a look at the queries ranked by averages as well.
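To make that caveat concrete, here is a hypothetical procedure (the name and body are purely illustrative) whose statistics never accumulate, because WITH RECOMPILE prevents its plan from being cached at all:

```sql
-- Hypothetical procedure: because of WITH RECOMPILE, its plan is never
-- cached, so sys.dm_exec_query_stats has no row against which to
-- accumulate totals across executions -- it can only ever surface in
-- per-execution (average-style) analysis, never in the totals.
CREATE PROCEDURE dbo.usp_CountSomeTable
WITH RECOMPILE
AS
BEGIN
    SELECT COUNT(*) FROM dbo.SomeTable;
END;
```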
Now, this won't give you the kind of cost which SQL Server provides while estimating the plans. Thus we cannot say that a query which is taking 10 CPU seconds is more expensive than a query clocking 100,000 logical IOs. You have to find the expensive queries based on each metric, such as CPU time, elapsed time, logical reads, etc.
Also, there are other things to take into account.
1. Queries which use hash joins and sorts need a memory grant, which is not part of these DMVs but lives in a separate DMV, sys.dm_exec_query_memory_grants. This is quite useful when you have memory-related issues: you might see low logical and physical reads, and the query might not appear in your top 5/10/20 list, but it could still be one of the main resource-consuming queries. Thus, when looking for memory-related issues, please see the details in the memory-grants DMV. However, the CPU time will include the time spent on sorts and the like, so in these cases using CPU time is a better choice.
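As a hedged sketch of that separate DMV (the TOP 10 cutoff is arbitrary), something like the following surfaces currently executing queries by the memory they asked for:

```sql
-- Currently executing queries ranked by requested memory grant.
-- sys.dm_exec_query_memory_grants only shows live requests, so run
-- this while the workload of interest is actually executing.
SELECT TOP 10
    mg.session_id,
    mg.requested_memory_kb,
    mg.granted_memory_kb,
    mg.used_memory_kb,
    t.[text] AS sql_text
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS t
ORDER BY mg.requested_memory_kb DESC;
```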
with PerformanceMetrics
as
(
select
substring
(
    dest.text,
    deqs.statement_start_offset/2 + 1,
    (case when deqs.statement_end_offset = -1 then datalength(dest.text)
          else deqs.statement_end_offset
     end - deqs.statement_start_offset)/2 + 1
) as 'Text of the SQL',
deqs.plan_generation_num 'Number of times the plan was generated for this SQL',
execution_count 'Total Number of Times the SQL was executed',
total_elapsed_time/1000 'Total Elapsed Time in ms consumed by this SQL',
Max_elapsed_time/1000 'Maximum Elapsed Time in ms consumed by this SQL',
min_elapsed_time/1000 'Minimum Elapsed Time in ms consumed by this SQL',
total_elapsed_time/1000/nullif(execution_count,0) 'Average Elapsed Time in ms consumed by this SQL',
total_worker_time/1000 'Total CPU Time in ms consumed by this SQL',
Max_worker_time/1000 'Maximum CPU Time in ms consumed by this SQL',
min_worker_time/1000 'Minimum CPU Time in ms consumed by this SQL',
total_worker_time/1000/nullif(execution_count,0) 'Average CPU Time in ms consumed by this SQL',
total_logical_reads 'Total Logical Reads Clocked by this SQL',
Max_logical_reads 'Maximum Logical Reads Clocked by this SQL',
min_logical_reads 'Minimum Logical Reads Clocked by this SQL',
total_logical_reads/nullif(execution_count,0) 'Average Logical Reads Clocked by this SQL',
total_physical_reads 'Total Physical Reads Clocked by this SQL',
Max_physical_reads 'Maximum Physical Reads Clocked by this SQL',
min_physical_reads 'Minimum Physical Reads Clocked by this SQL',
total_physical_reads/nullif(execution_count,0) 'Average Physical Reads Clocked by this SQL',
total_logical_writes 'Total Logical Writes Clocked by this SQL',
Max_logical_writes 'Maximum Logical Writes Clocked by this SQL',
min_logical_writes 'Minimum Logical Writes Clocked by this SQL',
total_logical_writes/nullif(execution_count,0) 'Average Logical Writes Clocked by this SQL',
deqp.query_plan 'Plan of Query',
DENSE_RANK() over(order by total_elapsed_time desc) 'Rank of the SQL by Total Elapsed Time',
DENSE_RANK() over(order by total_elapsed_time/nullif(execution_count,0) desc) 'Rank of the SQL by Average Elapsed Time',
DENSE_RANK() over(order by total_worker_time desc) 'Rank of the SQL by Total CPU Time',
DENSE_RANK() over(order by total_worker_time/nullif(execution_count,0) desc) 'Rank of the SQL by Average CPU Time',
DENSE_RANK() over(order by total_logical_reads desc) 'Rank of the SQL by Total Logical reads',
DENSE_RANK() over(order by total_logical_reads/nullif(execution_count,0) desc) 'Rank of the SQL by Average Logical reads',
DENSE_RANK() over(order by total_physical_reads desc) 'Rank of the SQL by Total Physical Reads',
DENSE_RANK() over(order by total_physical_reads/nullif(execution_count,0) desc) 'Rank of the SQL by Average Physical Reads',
DENSE_RANK() over(order by total_logical_writes desc) 'Rank of the SQL by Total Logical Writes',
DENSE_RANK() over(order by total_logical_writes/nullif(execution_count,0) desc) 'Rank of the SQL by Average Logical Writes',
DENSE_RANK() over(order by execution_count desc) 'Rank of the SQL by Total number of Executions'
--similarly, you can add ranks for the maximum values as well; that is quite useful in finding some perf issues.
from
sys.dm_exec_query_stats deqs
/*F0C6560A-9AD1-448B-9521-05258EF7E3FA*/ --a NEWID literal embedded so that this query can exclude itself from the performance-metrics output
outer apply sys.dm_exec_query_plan(deqs.plan_handle) deqp --sometimes the plan might no longer be in the cache, so use OUTER APPLY
outer apply sys.dm_exec_sql_text(deqs.sql_handle) dest --sometimes the text is not returned by the DMV, so use OUTER APPLY
where
dest.text not like '%F0C6560A-9AD1-448B-9521-05258EF7E3FA%'
)
select
*
from
PerformanceMetrics
where
1=1
--apply these WHERE clauses in any combination, or one by one, by commenting the unwanted lines out:
and [Rank of the SQL by Average CPU Time] <= 20
and [Rank of the SQL by Average Elapsed Time] <= 20
and [Rank of the SQL by Average Logical reads] <= 20
and [Rank of the SQL by Average Physical Reads] <= 20
and [Rank of the SQL by Total CPU Time] <= 20
and [Rank of the SQL by Total Elapsed Time] <= 20
and [Rank of the SQL by Total Logical reads] <= 20
and [Rank of the SQL by Total Physical Reads] <= 20
and [Rank of the SQL by Total number of Executions] <= 20
and [Rank of the SQL by Average Logical Writes] <= 20
and [Rank of the SQL by Total Logical Writes] <= 20