
Understanding AWR

What is AWR?
Automatic Workload Repository (AWR) is a collection of persistent system performance
statistics owned by the SYS user. It resides in the SYSAUX tablespace. By default, snapshots are
generated once every 60 minutes and retained for 7 days. AWR reports are used to investigate
performance and other issues.
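
As a rough illustration of what these defaults imply for repository size, the sketch below estimates how many snapshots are held once the retention window is full. The function name and parameters are hypothetical; only the 60-minute interval and 7-day retention come from the text.

```python
def snapshots_retained(interval_minutes: int = 60, retention_days: int = 7) -> int:
    """Approximate number of snapshots kept once the retention window is full."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    minutes_retained = retention_days * 24 * 60
    return minutes_retained // interval_minutes


default_count = snapshots_retained()        # 60-minute interval, 7 days
half_hourly_count = snapshots_retained(30)  # halving the interval doubles the count
```

With the defaults this gives 168 snapshots, so shortening the interval or lengthening retention increases SYSAUX usage proportionally.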

AWR collects data in the following categories:


Base Statistics: general database performance metrics since instance start-up.
SQL: statistics for each executed SQL statement (number of executions, number of physical reads, etc.).
Deltas: the rate of change of important statistics over time, similar to collection tools
that take before-and-after snapshots and show only the deltas over the specified period.
Expert Advice: results of the expert analysis engine provided in 10g.
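
The "Deltas" idea can be sketched as follows: two snapshots of cumulative counters are subtracted and divided by the elapsed time. This is only an illustration of the principle, not how AWR is implemented; the function name, the dictionary layout and the sample values are assumptions.

```python
def snapshot_delta(begin: dict, end: dict, elapsed_seconds: float) -> dict:
    """Return {statistic: (delta, rate per second)} between two snapshots."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    result = {}
    for name, end_value in end.items():
        if name not in begin:
            continue  # statistic missing from the begin snapshot: skip it
        delta = end_value - begin[name]
        if delta < 0:
            continue  # counter went backwards, e.g. instance restart in between
        result[name] = (delta, delta / elapsed_seconds)
    return result


# Hypothetical cumulative counters taken one hour apart.
begin = {"physical reads": 1_000_000, "user commits": 5_000}
end = {"physical reads": 1_792_720, "user commits": 6_016}
deltas = snapshot_delta(begin, end, elapsed_seconds=3600)
```

Here the physical reads delta is 792,720 over 3,600 seconds, which is a rate of about 220.2 per second.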
When a performance issue is reported, our first concern is what the database is
waiting for.
When processes wait, they are being held back from doing work by some
other factor. The events with the highest wait times give the greatest benefit when those times
are reduced, and as such are a good focus.
The Top Wait information shows exactly this and lets us concentrate on the main problem areas
without wasting time researching areas that are not causing significant delay.

Top 5 Timed Events (the ten most common events are explained here)

1. DB File Scattered Read. This generally indicates waits related to full table scans.
As full table scans are pulled into memory, they rarely fall into contiguous buffers but
instead are scattered throughout the buffer cache.
2. DB File Sequential Read. This event generally indicates a single block read (an
index read, for example). A large number of waits here could indicate poor joining orders
of tables, or unselective indexing. It is normal for this number to be large for a
high-transaction, well-tuned system, but it can indicate problems in some circumstances.
3. Free Buffer. This indicates your system is waiting for a buffer in memory, because
none is currently available. Waits in this category may indicate that you need to increase
the buffer cache (DB_CACHE_SIZE), if all your SQL is tuned. Free buffer waits could also indicate
that unselective SQL is flooding the buffer cache with index blocks, leaving
none for the statement that is waiting.
4. Buffer Busy. This is a wait for a buffer that is being used in an unshareable way or is
being read into the buffer cache. Buffer busy waits should not be greater than 1 percent.
Check the Buffer Wait Statistics section (or V$WAITSTAT) to find out if the wait is on a
segment header. If this is the case, increase the freelist groups or increase the pctused
to pctfree gap.
5. Latch Free. Latches are low-level queuing mechanisms (they're accurately referred to
as mutual exclusion mechanisms) used to protect shared memory structures in the
system global area (SGA). Latches are like locks on memory that are very quickly
obtained and released. Latches are used to prevent concurrent access to a shared
memory structure. If the latch is not available, a latch free miss is recorded. Most latch
problems are related to the failure to use bind variables (library cache latch), redo
generation issues (redo allocation latch), buffer cache contention issues (cache buffers
LRU chain), and hot blocks in the buffer cache (cache buffers chains).
6. Enqueue. An enqueue is a lock that protects a shared resource. Locks protect shared
resources, such as data in a record, to prevent two people from updating the same data
at the same time. An enqueue includes a queuing mechanism, which is FIFO (first in,
first out). Note that Oracle's latching mechanism is not FIFO. Enqueue waits usually
point to the ST enqueue, the HW enqueue, the TX4 enqueue, and the TM enqueue.
7. Log Buffer Space. This wait occurs because you are writing the log buffer faster than
LGWR can write it to the redo logs, or because log switches are too slow. To address
this problem, increase the size of the log files, or increase the size of the log buffer, or
get faster disks to write to. You might even consider using solid-state disks, for their high
speed.
8. Log File Switch. All commit requests are waiting for "log file switch (archiving
needed)" or "log file switch (checkpoint incomplete)". Ensure that the archive disk is not full or
slow. DBWR may be too slow because of I/O. You may need to add more or larger redo
logs, and you may potentially need to add database writers if DBWR is the problem.
9. Log File Sync. When a user commits or rolls back data, LGWR flushes the
session's redo from the log buffer to the redo logs. The session waits on log file sync
until this completes successfully. To reduce wait events here, try to commit more records at a time
(try to commit a batch of 50 instead of one at a time, for example). Put redo logs on a
faster disk, or alternate redo logs on different physical disks, to reduce the archiving
effect on LGWR. Avoid RAID 5, since it is very slow for applications that write a lot;
potentially consider using file system direct I/O or raw devices, which are very fast at
writing information.
10. Idle Event. There are several idle wait events listed after the output; you can ignore
them. Idle events are generally listed at the bottom of each section and include such
things as SQL*Net message to/from client and other background-related timings. Idle
events are listed in the stats$idle_event table.
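
To make the idea of "top" events concrete, the sketch below ranks wait events by time waited after dropping idle events, and expresses each as a share of DB time. It is a minimal illustration: the idle list is only the two SQL*Net events named in the text, and the function name and sample figures are hypothetical.

```python
# Illustrative subset only: the two idle events named in the text.
IDLE_EVENTS = {"SQL*Net message from client", "SQL*Net message to client"}


def top_events(waits: dict, db_time_s: float, n: int = 5, idle=IDLE_EVENTS) -> list:
    """Rank non-idle events by seconds waited; return (event, seconds, % of DB time)."""
    if db_time_s <= 0:
        raise ValueError("db_time_s must be positive")
    busy = {event: secs for event, secs in waits.items() if event not in idle}
    ranked = sorted(busy.items(), key=lambda item: item[1], reverse=True)[:n]
    return [(event, secs, round(100.0 * secs / db_time_s, 1)) for event, secs in ranked]


# Hypothetical seconds waited per event during one snapshot interval.
waits = {
    "db file sequential read": 420.0,
    "log file sync": 95.0,
    "SQL*Net message from client": 9000.0,
    "db file scattered read": 610.0,
}
top = top_events(waits, db_time_s=1500.0, n=2)
```

Although the idle event has by far the largest wait time, it is excluded, and the two scattered/sequential read events come out on top.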

SQL Statistics

AWR Reports show a number of different SQL statistics:


SQL Ordered by Elapsed Time: Includes SQL statements that took significant execution time
during processing.

As the name suggests, this section lists SQL queries ordered by elapsed time within the reported
interval. Look for queries with low executions and a high Elapsed Time per Exec (s); such a query
could be a candidate for troubleshooting or optimization. If the first
query has the highest elapsed time but no executions, investigate it.
An important point: if Executions is 0, it does not mean the query is not executing. This can be the
case when the query was still executing when the AWR snapshot was taken, so its completion was
not covered in the report.
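
The check described above can be sketched as a small filter over rows of (SQL id, elapsed seconds, executions). The row layout, the function names and the 10-second threshold are assumptions for illustration; a statement with zero executions is flagged rather than divided by zero, since it was probably still running at snapshot time.

```python
def elapsed_per_exec(elapsed_s: float, executions: int):
    """Elapsed seconds per execution, or None if no execution completed in the interval."""
    if executions == 0:
        return None
    return elapsed_s / executions


def tuning_candidates(rows, min_per_exec_s: float) -> list:
    """Flag statements that never completed or that are slow per execution."""
    flagged = []
    for sql_id, elapsed_s, executions in rows:
        per_exec = elapsed_per_exec(elapsed_s, executions)
        if per_exec is None or per_exec >= min_per_exec_s:
            flagged.append((sql_id, per_exec))
    return flagged


# Hypothetical rows: (sql_id, elapsed seconds, executions).
rows = [("sql_a", 1200.0, 0), ("sql_b", 900.0, 3), ("sql_c", 300.0, 60000)]
flagged = tuning_candidates(rows, min_per_exec_s=10.0)
```

In this sample, sql_a is flagged because it has no completed execution and sql_b because it takes 300 seconds per execution, while sql_c is cheap per execution despite its total.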
SQL Ordered by CPU Time: Includes SQL statements that consumed significant CPU time
during processing. SQL queries are listed by the CPU time they consumed, i.e. the queries
causing high load on the system. The top few queries are candidates for
optimization.
Look for the queries using the most CPU time. If a query shows 0 executions, this does not mean
the query is not executing. As with SQL ordered by Elapsed Time, the
query may still have been executing when the snapshot was taken.
There are many other statistics in an AWR report that a DBA needs to consider. I have
listed only ten of them, but these are the most commonly used statistics for performance-related
information.
SQL Ordered by Gets: These SQLs performed a high number of logical reads while retrieving
data.
SQL Ordered by Reads: These SQLs performed a high number of physical disk reads while
retrieving data.
SQL Ordered by Parse Calls: These SQLs experienced a high number of reparsing
operations.
SQL Ordered by Sharable Memory: Includes SQL statement cursors that consumed a
large amount of SGA shared pool memory.
SQL Ordered by Version Count: These SQLs have a large number of versions (child cursors) in the
shared pool.

Load Profile

Depending on the waits, the load profile section provides either useful general background
information or specific details related to potential issues.
The AWR report "Load Profile" section, as shown below, contains a lot of very useful but often
overlooked information. Readers usually prefer the "Instance Efficiency Percentages"
section, which is easy to read but also very easy to misinterpret.
There are two primary reasons for looking at the load profile first.

1. If you are comparing one AWR report with another, the load profile helps you
confirm that you executed the same or a similar test on your application, and that
no unexpected activity (for example, a database backup job) affected database performance
during that period. The numbers will not be exactly the same, but
they should be in an expected range. If that is not the case, then you must find out what
changed either in the test you are running or in the system (application or
database).
2. To analyse the workload using the performance statistics.
                     Per Second  Per Transaction  Per Exec  Per Call
DB Time(s):                 0.1              0.2      0.01      0.08
DB CPU(s):                  0.0              0.1      0.01      0.05
Redo size:              2,650.2          9,391.1
Logical reads:            863.7          3,060.4
Block changes:             31.6            112.1
Physical reads:           220.2            780.2
Physical writes:            0.7              2.3
User calls:                 0.9              3.1
Parses:                     2.4              8.4
Hard parses:                0.2              0.8
W/A MB processed:     356,098.6      1,261,849.5
Logons:                     0.0              0.2
Executes:                   5.3             18.9
Rollbacks:                  0.0              0.0
Transactions:               0.3
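
One quick sanity check on a load profile is that, for every row, Per Second divided by Per Transaction should give roughly the same transaction rate. The sketch below applies this to four rows of the figures above; the dictionary and function name are illustrative, and the assumption is simply that both columns are derived from the same totals.

```python
# (per second, per transaction) pairs copied from the load profile above.
load_profile = {
    "Redo size": (2650.2, 9391.1),
    "Logical reads": (863.7, 3060.4),
    "Block changes": (31.6, 112.1),
    "Physical reads": (220.2, 780.2),
}


def implied_tps(per_second: float, per_transaction: float) -> float:
    """Transactions per second implied by one load profile row."""
    if per_transaction == 0:
        raise ValueError("per_transaction must be non-zero")
    return per_second / per_transaction


estimates = {
    name: round(implied_tps(per_sec, per_txn), 2)
    for name, (per_sec, per_txn) in load_profile.items()
}
```

Every row implies about 0.28 transactions per second, which agrees with the reported Transactions value of 0.3 once it is rounded to one decimal place.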

1. Redo size: The amount of redo generated during the report interval.
2. Logical Reads: Calculated as Consistent Gets + DB Block Gets = Logical Reads.
3. Block changes: The number of blocks modified during the sample interval.
4. Physical Reads: The number of requests for a block that caused a physical I/O operation.
5. Physical Writes: Number of physical writes performed.
6. User Calls: Number of user calls (such as logins, parses, fetches and executes) generated.
7. Parses: The total of all parses, both hard and soft.
8. Hard Parses: The parses requiring a completely new parse of the SQL statement. These
consume both latches and shared pool area.
9. Soft Parses: Soft parses are not listed but are derived by subtracting the hard parses from
parses. A soft parse reuses a previous hard parse; hence it consumes far fewer resources.
10. Sorts, Logons, Executes and Transactions: All self-explanatory.
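
The two derived figures mentioned in this list can be written out directly. The sketch below uses hypothetical function names; the parse rates are the Per Second values from the load profile, while the consistent gets and DB block gets values are made up for illustration.

```python
def logical_reads(consistent_gets: float, db_block_gets: float) -> float:
    """Logical reads are the sum of consistent gets and DB block gets."""
    return consistent_gets + db_block_gets


def soft_parses(total_parses: float, hard_parses: float) -> float:
    """Soft parses are not reported; derive them from total and hard parses."""
    return total_parses - hard_parses


# Parses 2.4/s and hard parses 0.2/s, as in the load profile.
soft_per_second = round(soft_parses(2.4, 0.2), 1)
# Hypothetical split of logical reads into its two components.
total_gets = logical_reads(800, 63)
```

With 2.4 parses and 0.2 hard parses per second, about 2.2 parses per second are soft.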
Workload section:

1. % Blocks changed per Read:
The % Blocks changed per Read statistic indicates the percentage of blocks read that were
changed, i.e. retrieved for update.
Blocks Changed per Read % = (Block Changes * 100) / Logical Reads
2. Recursive Call %:
Sometimes, in order to execute a SQL statement issued by a user, Oracle must issue additional
statements. Such statements are called recursive calls or recursive SQL statements. For
example, if you insert a row into a table that does not have enough space to hold that row, then
Oracle makes recursive calls to allocate the space dynamically. Recursive calls are also
generated when data dictionary information is not available in the data dictionary cache and
must be retrieved from disk.
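
As a worked example of the % Blocks changed per Read statistic, the sketch below computes block changes as a percentage of logical reads, using the Per Second values from the load profile (31.6 block changes, 863.7 logical reads). The function name is hypothetical.

```python
def blocks_changed_per_read_pct(block_changes: float, logical_reads: float) -> float:
    """Percentage of logical reads that resulted in a block change."""
    if logical_reads == 0:
        return 0.0  # no reads in the interval: report 0 rather than divide by zero
    return 100.0 * block_changes / logical_reads


changed_pct = round(blocks_changed_per_read_pct(31.6, 863.7), 2)
```

For this workload the result is about 3.66%, meaning that the vast majority of blocks were read without being modified.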

Instance Efficiency

Instance efficiency statistics are more useful for general tuning than for addressing specific
issues.
Instance Efficiency Percentages are most useful as a starting point:

Buffer Nowait %: ratio that shows whether server processes had to wait for buffers. A low value
indicates that there are some wait events related to the buffer cache.
Buffer Hit %: ratio that shows the portion of requested blocks that were read from the buffer
cache. It is often recommended to tune this ratio toward 100%, but bear in mind that
even 100% does not mean that the system is healthy.
Library Hit %: ratio that refers to library cache efficiency and shows how often the requested
SQL was found in the library cache. 100% means that there were no hard parses at all (the system
in the example had been running for a few days, so all possible SQL statements were parsed and
stored in the library cache). A low value may indicate a library cache that is too small, SQL not
using bind variables, frequent invalidations of referenced objects, or simply that the system was
just started.

Execute to Parse %: ratio that compares the number of executions to the number of parses. Target is
100%.
Parse CPU to Parse Elapsd %: shows how much of the parse time was spent on CPU rather than waiting
for latches during parsing. Target is 100% (no waits for latches). Parse Elapsd is the sum of
Parse CPU and waits, so if waits are low then this ratio will be high, and vice versa.
Redo NoWait %: ratio that shows the percentage of times redo was available without waits. Target is
100%. If the ratio is low then there is probably some problem with the redo logs.
In-memory Sort %: ratio that refers to Sort Area efficiency. A low value is a signal that the Sort
Area is probably undersized for the workload. Target is 100%, which means that all sorts were
processed in memory.
Soft Parse %: ratio of soft parses to total parses. Hard parses are expensive operations and
consume a lot of CPU time. Target is 100%: all SQL is in the cache and no hard parse occurred.
A low value indicates a problem in the library cache.
Latch Hit %: ratio that shows the percentage of times latches were available without waits. Target is
100%. A low value indicates a problem with latches.
% Non-Parse CPU: shows the percentage of CPU time that was spent on execution of SQL. A low value
indicates a problem with SQL parsing: Oracle uses more CPU for parsing.
These ratios can help to investigate problems that occurred during the snapshot period.
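
Three of these percentages can be reproduced from counters with the commonly used formulas sketched below. The function names are hypothetical and the formulas are stated as assumptions about how the ratios are usually derived, not as taken from this report; the parse and execute rates are the Per Second values from the load profile, and the parse timing figures are made up.

```python
def execute_to_parse_pct(parses: float, executes: float) -> float:
    """100 * (1 - parses / executes): higher means cursors are reused without reparsing."""
    if executes == 0:
        return 0.0
    return 100.0 * (1 - parses / executes)


def soft_parse_pct(parses: float, hard_parses: float) -> float:
    """100 * (1 - hard parses / parses): share of parses that were soft."""
    if parses == 0:
        return 0.0
    return 100.0 * (1 - hard_parses / parses)


def parse_cpu_to_elapsed_pct(parse_cpu_s: float, parse_elapsed_s: float) -> float:
    """100 * parse CPU / parse elapsed: lower means more waiting during parsing."""
    if parse_elapsed_s == 0:
        return 0.0
    return 100.0 * parse_cpu_s / parse_elapsed_s


exec_to_parse = round(execute_to_parse_pct(parses=2.4, executes=5.3), 1)
soft_pct = round(soft_parse_pct(parses=2.4, hard_parses=0.2), 1)
parse_cpu_pct = parse_cpu_to_elapsed_pct(parse_cpu_s=8.0, parse_elapsed_s=10.0)
```

With the load profile rates, Execute to Parse comes out near 54.7% and Soft Parse near 91.7%, both below the 100% target, which would point toward parsing as an area to review.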

Latch Activity

If latch waits are significant, look for high latch sleeps under
Latch Sleep Breakdown for latch free waits.
Here, the top latch is cache buffers chains. Cache buffers chains latches protect the buffers in
the buffer cache that hold data retrieved from disk. This is a perfectly normal latch
to see when data is being read. When it becomes stressed, the sleeps figure tends to rise as
sessions start to wait to get the buffers they require. Contention can be caused by poorly tuned
SQL reading the same buffers.
Latch information is provided in the following three sections:
1. Latch Activity
2. Latch Sleep Breakdown
3. Latch Miss Sources
This information should be checked whenever the "latch free" wait event or other latch wait
events experience long waits. These sections are particularly useful for determining latch contention
on an instance. Latch contention generally indicates resource contention and supports
indications of it in other sections. Latch contention is indicated by a Pct Miss greater than
1.0% or a relatively high value in Avg Sleeps/Miss. While each latch can indicate contention on
some resource, the more common latches to watch are:

cache buffers chains: This latch protects the hash chains of cache buffers, and
is used for each access to cache buffers.
shared pool: The shared pool latch is heavily used during parsing, in particular during hard
parse.
library cache: The library cache latch is heavily used during both hard and soft parsing.
row cache objects: The row cache latch protects the data dictionary information, such as
information about tables and columns.
cache buffers lru chain: The buffer cache has a set of LRU block chains, each protected by
one of these latches.
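
The contention rule described above (Pct Miss above 1.0%, or a relatively high Avg Sleeps/Miss) can be sketched as a simple filter. The row layout, the function name and the Avg Sleeps/Miss threshold of 1.0 are assumptions; the text gives a number only for Pct Miss.

```python
def contended_latches(latches, max_pct_miss: float = 1.0,
                      max_avg_sleeps_per_miss: float = 1.0) -> list:
    """Return (name, pct_miss, avg_sleeps_per_miss) for latches over either threshold."""
    flagged = []
    for name, gets, misses, sleeps in latches:
        pct_miss = 100.0 * misses / gets if gets else 0.0
        avg_sleeps = sleeps / misses if misses else 0.0
        if pct_miss > max_pct_miss or avg_sleeps > max_avg_sleeps_per_miss:
            flagged.append((name, round(pct_miss, 2), round(avg_sleeps, 2)))
    return flagged


# Hypothetical rows: (latch name, gets, misses, sleeps).
latches = [
    ("cache buffers chains", 1_000_000, 25_000, 5_000),
    ("shared pool", 200_000, 400, 100),
    ("library cache", 500_000, 1_000, 3_000),
]
flagged_latches = contended_latches(latches)
```

In this sample, cache buffers chains is flagged for its 2.5% miss rate and library cache for averaging three sleeps per miss, while shared pool stays under both thresholds.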
Notable timed and wait events:

CPU time events


CPU appearing as the top timed event in AWR does not by itself indicate a problem.
However, if performance is slow and CPU usage is high, then start investigating.
First, check whether a single SQL statement accounts for most of the CPU under SQL ordered by
CPU Time in AWR.

Once you have identified the SQL statements that are using the highest CPU, investigate the
reason for this usage.
Look at the number of executions and see whether that is appropriate for this statement.
Excessive executions might indicate that the statement is being called too frequently and
it might be possible to execute it for a group of rows rather than row by row (i.e. execute
it in a batch).
Check whether the amount of CPU per execution is excessive; this might indicate that the
statement itself is inefficient.

Additionally, look at the other SQL Statistics in the AWR report to see if the SQLID(s) in
question show excessive values for any of those, then deal with the statement
appropriately.
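
The two checks above (too many executions versus too much CPU per execution) can be sketched as a small triage function. Both thresholds and the label strings are hypothetical and would need to be chosen for the workload in question.

```python
def triage_cpu(executions: int, cpu_s: float,
               max_executions: int, max_cpu_per_exec_s: float) -> list:
    """Label a statement from 'SQL ordered by CPU Time' with likely problem types."""
    findings = []
    if executions > max_executions:
        # Called very often: consider executing for a group of rows in a batch.
        findings.append("too_frequent")
    if executions > 0 and cpu_s / executions > max_cpu_per_exec_s:
        # Expensive each time it runs: the statement itself may be inefficient.
        findings.append("expensive_per_exec")
    return findings


# Hypothetical statements with the same illustrative thresholds.
chatty = triage_cpu(500_000, 250.0, max_executions=100_000, max_cpu_per_exec_s=1.0)
heavy = triage_cpu(4, 800.0, max_executions=100_000, max_cpu_per_exec_s=1.0)
```

The first statement is cheap per call but executed half a million times, so batching is the likely fix; the second runs only four times at 200 CPU seconds each, so the statement itself deserves tuning.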

CPU related Issues:

Check to see if other waits follow the high CPU timed event.
High External CPU usage
Troubleshooting CPU usage

'Log file sync' waits


When a user session commits or rolls back, the log writer flushes the redo from the log buffer to
the redo logs. AWR reports are very useful for determining whether this is a problem and whether
the cause of the problem is I/O or some other area.
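
One common way to separate the I/O case from the others, which is an assumption added here rather than something stated in the text, is to compare the average 'log file sync' wait with the average 'log file parallel write' wait of LGWR. If most of the sync time is spent in the write itself, I/O is the likely cause. The function names, the 0.5 threshold and the sample figures are hypothetical.

```python
def avg_wait_ms(total_wait_seconds: float, wait_count: int) -> float:
    """Average wait in milliseconds for one event over the snapshot interval."""
    if wait_count == 0:
        return 0.0
    return 1000.0 * total_wait_seconds / wait_count


def looks_io_bound(sync_ms: float, lgwr_write_ms: float,
                   io_share_threshold: float = 0.5) -> bool:
    """True when the LGWR write time makes up most of the log file sync time."""
    if sync_ms <= 0:
        return False
    return lgwr_write_ms / sync_ms >= io_share_threshold


# Hypothetical totals: (seconds waited, number of waits).
sync_ms = avg_wait_ms(95.0, 19_000)    # 'log file sync'
write_ms = avg_wait_ms(60.0, 15_000)   # 'log file parallel write'
io_bound = looks_io_bound(sync_ms, write_ms)
```

Here the average sync wait is 5 ms and the average write is 4 ms, so the write accounts for most of the wait and the redo log I/O path would be the first place to look.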

Buffer busy waits

This is the event waited on when a session is trying to get a buffer from the buffer cache but the
buffer is busy: either it is being read by another session, or another session is holding it in an
incompatible mode.

Waits for 'cursor: mutex/pin'

If there are mutex waits such as 'cursor: pin S wait on X' or 'cursor: mutex X', then
these are indicative of parsing issues. On this basis, look for statements with high parse counts
or high version counts under 'SQL ordered by Parse Calls' and 'SQL ordered by Version Count',
as these are most likely to be the causes of problems.

Use of ADDM Reports alongside AWR

ADDM reports can be reviewed along with AWR to assist in diagnosis, since they provide
specific recommendations which can help point at potential problems.
