
Detailed Analysis of Monitoring Process

Standard System Checks



1.1 Check that All Application Servers are Up and Work Process Analysis - SM51

Transaction SM51 allows you to look at all the servers in your system (for example, the
PRD database server and all of its application servers). You do not have to log into each
server individually.

If one of your dialog application servers is not up, the users who normally log on to
that application server will not have a server to log on to.
If the batch application server is down, batch jobs that are specified to run on that
server will not run.

1. In the Command field, enter transaction SM51 and choose Enter (or choose
Tools → Administration, then Monitor → System monitoring → Servers).
2. Review the list of instances under Server name. Verify that all your instances are
listed. If an instance is listed, it is up and running.

Process overview transactions allow users to view the status of work processes and
monitor for problems. Transaction SM51 is a central transaction from which you can
select the instance to monitor. SM51 starts transaction SM50 for each application server.
Transaction SM50 is used for a system without application servers.
Transaction SM51 is one place to look for jobs or programs that may be hung, which
would be indicated by long run times. If batch jobs are not running, transaction SM50 may
provide a hint of the problem, if all the batch work processes are in use.
1. In the Command field, enter transaction SM51 and choose Enter (or choose
Tools → Administration, then Monitor → System monitoring → Servers).
2. Select the instance you want to view.
3. Choose Processes.
4. This is the Process Overview transaction (SM50) for that instance.

Some of the column definitions are:


Column    Definition
No        Work process number
Ty        Type of work process
PID       OS PID (process ID) number
Status    Current status of the work process
Err       Number of detected errors in the work process
CPU       Cumulative CPU time that the current process is taking
Time      Cumulative wall time that the current process is taking
Program   Name of the ABAP program
Clie      Client number
User      User ID that is using the work process
Table     Table that the action is being performed on
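
As a rough illustration of how the Status and Time columns help spot hung programs, the sketch below flags running work processes whose wall time exceeds a threshold. The row layout, the `find_long_running` helper, the program names, and the 600-second threshold are all assumptions made for this example; SM50 does not deliver data in this form.

```python
from typing import Dict, List


def find_long_running(rows: List[Dict], max_seconds: int = 600) -> List[Dict]:
    """Return running work processes whose wall time exceeds max_seconds."""
    return [
        r for r in rows
        if r["Status"].lower() == "running" and r["Time"] > max_seconds
    ]


# Hypothetical rows transcribed by hand from the Process Overview (SM50).
rows = [
    {"No": 0, "Ty": "DIA", "Status": "Running", "Time": 12, "Program": "ZREPORT_A"},
    {"No": 1, "Ty": "BGD", "Status": "Running", "Time": 5400, "Program": "ZLONGJOB"},
    {"No": 2, "Ty": "DIA", "Status": "Waiting", "Time": 0, "Program": ""},
]

for r in find_long_running(rows):
    print(f"WP {r['No']} ({r['Ty']}): {r['Program']} running for {r['Time']} s")
```

A suitable threshold depends on the installation; batch work processes normally run far longer than dialog work processes.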

1.2 SM04 (User Overview)

Menu Path:
Tools/Admin/Monitoring/System/Users
Note: Look for unfamiliar user IDs, terminal addresses that do not resolve to names, the
clients that users are logged on to, and an unusual number of open sessions. Also ensure
that tracing is turned off.

1.3 SM66 (Global Process Monitor)

Menu Path:
Tools/Administration/Computing Center/Management System/Control/All work Processes
Provides an overview of work process activity across the system. Helps to identify
long-running programs as well as the reason for slow performance.
Important Statistics:
Blocked work processes (Time > 5 sec)
Long database operations (for example, SAPLGLIV, SAPLNRIV) running for a long time
Many load operations
Work processes in PRIV mode
Work processes with a high number of restarts

1.4 ST07 (Application Monitor)

It is the responsibility of the BASIS administrator to monitor the number of users on
each application server. Based on this distribution, logon groups can be configured
(transaction SMLG).

1.5 Locks, Transaction SM12


A lock is a mechanism that prevents other users from changing the record on which you
are working. There may be old locks still in place from transactions that did not release,
or from when the user was cut off from the network. Unless cleared, these locks prevent
access or change to the record until the system is cycled. The easiest way to locate them is
to look for locks from prior days.
Important: The profile parameter rdisp/gui_auto_logout should be set. This parameter
defines an automatic logout of the user if there is no activity for the set number of
minutes.
1. In the Command field, enter transaction SM12 and choose Enter (or choose
Tools → Administration, then Monitor → Lock entries).
2. Enter * in Client.
3. Clear the User name field.
4. Choose Enter.
5. Look for locks from previous days in the Time column.
The presence of a lock from a previous day could mean that the user was
disconnected from the network and the R/3 System.
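
The check in step 5 amounts to filtering the lock list by date. A minimal sketch, assuming the lock entries have been copied into a list of records; the field names, users, and tables are hypothetical and do not reflect how SM12 stores or exports data:

```python
from datetime import date, datetime
from typing import Dict, List


def stale_locks(entries: List[Dict], today: date) -> List[Dict]:
    """Return lock entries whose timestamp falls before the given day."""
    return [e for e in entries if e["time"].date() < today]


# Hypothetical lock entries transcribed from the SM12 list.
entries = [
    {"client": "100", "user": "JSMITH", "time": datetime(2024, 3, 4, 16, 45), "table": "MARA"},
    {"client": "100", "user": "MLEE", "time": datetime(2024, 3, 5, 9, 10), "table": "VBAK"},
]

old = stale_locks(entries, today=date(2024, 3, 5))
for e in old:
    print(f"{e['user']} holds a lock on {e['table']} since {e['time']:%Y-%m-%d %H:%M}")
```

A lock flagged this way is only a candidate for deletion; the user's activity still has to be verified before the lock is removed.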

To clear a lock, complete these steps on the system's application and database servers:
1. Check that the user is not logged on any of the servers with transaction SM04 (no
application servers) or AL08 (with application servers).
If the user is not on the system, but transaction SM04 shows them on the system,
delete their sessions. This step, by itself, may clear the lock.
2. Check that there are no processes running under the user ID using transaction SM50 or
SM51.
3. Check that there are no batch jobs running under the user ID using transaction SM37.
4. Check that there are no updates in process for that user ID using transaction SM13.
5. Once you know that there is no activity using the user's ID, select the lock entry for
deletion.
6. Choose Lock entries → Delete.

1.6 Failed Updates, Transaction SM13

A failed update or an update terminate is an update to the database that failed. These
failed updates occur when a user entry or transaction is not entered or updated in the
database.
You should check the system for failed updates several times a day.
The longer you wait after the update terminate has occurred, the more difficult it is for
users to remember what they did when the update terminate occurred.
1. In the Command field, enter transaction SM13 and choose Enter (or choose
Tools → Administration, then Monitor → Update).
2. Enter * in Client.
3. Enter * in User.
4. Under Status, select All.
5. Change the date to a year ago in From date.

6. Look for entries with an Err in the Status column.


These entries are failed updates or update terminates. You may also see other entries listed
without the Err status. If you have no failed updates, you may stop here. If you do have
failed updates, continue.

7. Double-click on the entry with an Err status.


8. Choose ABAP short dump.
If a short dump exists, it will appear.
Some problems that can occur with an update terminate include:
No short dump
In this case, the only clues you have are the:
User ID
Date
Time
Transaction
Difficulty reading the short dump
The ability to read a short dump comes with experience and practice. Only
some of a short dump's content is useful to the developer.
Short dump with little usable information

Update terminate occurring downstream from the actual transaction


The data in the short dump may be of little value in finding the root source of
the update terminate (for example, if the terminate occurred in the FI posting
of an SD transaction, you do not know which SD transaction document caused
the problem).
Update terminate occurring in a batch job
There is no indication of which batch job (by job name) caused the update
terminate.
9. The users need to be contacted. They should check for the missing entry and
reprocess the missing transaction. Do not attempt to reapply the failed update.

1.7 System Log, Transaction SM21

The system log is the R/3 System's log of events, errors, problems, and other system
messages.
1. In the Command field, enter transaction SM21 and choose Enter (or choose
Tools → Administration, then Monitoring → System log).
2. Enter the beginning date and time that you want to review from the log in from
date/time. (You can also enter an end date and time if you want to view a specific time
period.)
3. Choose Reread system log. (If you would like to see the log from all application
servers, go to System log → Choose → Central system log, then choose Reread system
log.)

These options allow you to view the system log in the following priority layers:
Problems
Problems and warnings
All messages

What to look for:


Unusual entries
Before you can recognize the unusual entries, you will need to become familiar with
which entries are in the log under normal conditions (for your installation for a
specific system).
Column C for the error status
Errors are the red Ks and warnings are the yellow Ws. These entries can also be
examined with the Alert Monitor (RZ20).
Double click on any entry in the log to gain additional information.

1.8 Batch Input Jobs, In-Error or To Be Processed, Transaction SM35


This transaction shows jobs that need to be processed or started, and jobs with errors that
need to be resolved.
This transaction is important because it alerts you to batch input jobs that:
Need to be processed
These are jobs that are waiting to be processed. If not processed, the data will not post
to the system.
Are in error
These are jobs that have failed due to an error. The danger is that only a portion of the
job may have posted to the system. This increases the potential for data corruption of a
different sort, as only part of the data is in the system.
1. In the Command field, enter transaction SM35 and choose Enter (or choose
System → Services → Batch input → Edit).
2. Enter a start date of at least a week ago (or even further back if needed) in Creation
date from.
3. Under Session status, select both:
To be processed

Incorrect
These selections display only the batch jobs that need to be processed and those
with errors that need to be resolved.
4. Choose Enter.
5. Contact the responsible user to notify them or determine why these jobs are in:
a. Sessions still to be processed
b. Errors in sessions

1.9 Background Jobs, Transaction SM37


Background jobs are batch jobs scheduled to run at specific times during the day.
If you are running critical jobs, you need to know if the job failed because there may be
other processes, activities, or tasks that are dependent on these jobs.
You should have a list of all the critical jobs scheduled to run. For each of these jobs, you
should have a list that shows:
When they are scheduled to run
The expected run time
Emergency contact (names and phone numbers) if a job fails or has problems

Restart or problem procedure for the job

1. In the Command field, enter transaction SM37 and choose Enter (or choose
System → Services, then Jobs → Job overview).
2. Enter * to get all jobs in Job name.
3. Enter either * (for all users) or the user ID that the batch jobs run under (to limit the
display to those scheduled under a specific user ID in User name).
4. Enter a start date in From.
5. Enter an end date in To.
6. Under Only jobs with status, select:
Active
Finished
Terminated
7. Choose Enter.
8. Check for failed or cancelled jobs.
Analyze why jobs failed or were cancelled and make the necessary corrections.
9. Check critical jobs.
You need to know the job name that they run under to do this.

To check a job log:


10. Select the job.
11. Then choose Job log.
Check job performance and record run times. A deviation from the usual run time on a
job may indicate a problem and should be investigated.
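
One simple way to act on recorded run times is to compare the latest run with the historical average. The sketch below is an assumption-laden illustration: the `runtime_deviates` helper, the 50 percent tolerance, and the sample run times are invented for this example and are not SAP-defined values.

```python
from statistics import mean
from typing import List


def runtime_deviates(history: List[float], latest: float, tolerance: float = 0.5) -> bool:
    """Return True if latest differs from the historical mean by more than
    the given fraction (0.5 means 50 percent) of that mean."""
    baseline = mean(history)
    return abs(latest - baseline) > tolerance * baseline


# Hypothetical run times in minutes, recorded by hand from SM37 job logs.
history_min = [30, 32, 29, 31]

print(runtime_deviates(history_min, 33))  # within tolerance
print(runtime_deviates(history_min, 75))  # far above the usual run time
```

A job that finishes much faster than usual is flagged as well, which is intentional: an unusually short run can mean the job processed no data.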

1.10 Transaction DB02

This task involves monitoring the growth of the database and projecting future growth to
determine when to plan to expand the database.
1. In the Command field, enter transaction DB02 and choose Enter (or choose
Tools → CCMS, then Control/Monitoring → Performance Menu, and then
Database → Table/Indexes).
2. Choose Space statistics under Database system.
3. Check %-Used.
Drill down on Tablespaces or Tables and indexes if issues are discovered.
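
Projecting future growth can be as simple as extrapolating the average daily increase. The following sketch assumes equally spaced daily samples of used space noted from DB02; the function name, the sample values, and the linear-growth assumption are illustrative only.

```python
from typing import List, Optional


def days_until_full(used_gb: List[float], capacity_gb: float) -> Optional[float]:
    """Estimate the days until capacity is reached from daily samples.

    Uses the average daily growth between the first and last sample.
    Returns None if the database is not growing.
    """
    if len(used_gb) < 2:
        raise ValueError("need at least two samples")
    daily_growth = (used_gb[-1] - used_gb[0]) / (len(used_gb) - 1)
    if daily_growth <= 0:
        return None
    return (capacity_gb - used_gb[-1]) / daily_growth


# Hypothetical used space in GB on four consecutive days.
samples = [400.0, 402.0, 404.0, 406.0]
print(days_until_full(samples, capacity_gb=500.0))  # 47.0
```

Real growth is rarely linear (month-end closing and archiving runs distort it), so the result is best read as a planning estimate rather than a deadline.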

1.11 Check the Spool, Transaction SP01

The spool is the R/3 System's output manager. Data sent to the printer is sent to the R/3
spool and then sent to the operating system to print.
There may be problems with the printer at the operating system level. These problems
need to be resolved immediately for time-critical print jobs. Check for active spool jobs
that have been running for over an hour. These long-running jobs could indicate a problem
with the operating system spool or the printer.
1. In the Command field, enter transaction SP01 and choose Enter (or choose
System → Services → Output controller).
2. Clear User name.
3. Set the From date to a week ago.
4. Clear Client.
5. Choose Enter.
6. Look for jobs with an error in the Output Status column.

1.12 Printing / Spool System, Transaction SPAD


The spool administration screen can be used to identify problems and errors with printing
processes.
1. In the Command field, enter transaction SPAD and choose Enter (or choose
Tools → CCMS → Spool → Spool administration).
2. At the bottom of the screen, choose Print request overview.
3. Check the number of Total print requests.
This number should never be more than 99,000.
When the spool is this large, performance issues begin to develop.

1.13 ABAP Dump Analysis, Transaction ST22


An ABAP dump (also known as a short dump) is generated when a report or transaction
terminates as the result of a serious error. The system records the error in the system log
(transaction SM21) and writes a snapshot (dump) of the program termination to a special
table.
You use an ABAP dump to analyze and determine why the error occurred, and to take
corrective action.
1. In the Command field, enter transaction ST22.
There are two selection methods to display the list of dumps:
For simple selection of today or yesterday, go to step 2.
For free selection, go to step 5.
2. Under No. of short dumps, if there is a value other than zero (0) in Today or Yesterday,
dumps have occurred that need to be examined.
3. Select Today.
4. Choose Display list to get a list of short dumps for the day. Go to step 8.
5. Choose Selection.
6. Enter your selection criteria in the ABAP Dump Analysis screen.
7. Choose Execute.
8. Double-click on the dump you want to analyze.
9. This screen shows the short dump.

Performance Checks
2.1 CCMS Central Alert Monitor, Transaction RZ20


Transaction RZ20 is a centralized alert monitor and is new with Release 4.0. With this
transaction, you can monitor the servers in your landscape, such as development, QA,
testing, production, and so on. You no longer have to individually log into each system to
search for alerts.

An alert indicates a potentially serious problem that should be quickly resolved. If not
contained, these problems could degenerate into a disaster.
1. In the Command field, enter transaction RZ20 and choose Enter (or choose
Tools → CCMS, then Control/Monitoring → Alert Monitor 4.0).
2. Click the node next to the server to expand the server options.
3. Double-click on the monitor.
4. Look for any alerts, which are indicated in red.
5. Click the node next to the <sid> to drill down for additional details.
6. Select the alert
7. Choose Display alerts.
To get detail for the alert:
8. Select the alert item.
9. Choose Choose detail (magnifying glass icon).
10. Review the details.
11. Choose Back
To acknowledge the alert:
12. Select the alert item(s).
Alert (red)
Warning (yellow)
13. Choose Complete alerts.
You still have to perform a task based upon the alert.
Acknowledging the alert only means you received the alert notification,
nothing else.
14. When all alerts and warnings are acknowledged, the alert will change to green.
To get more details on the alert, you can start the analysis tool.
From the previous screen:
1. Select the alert (for example, Page_Out)
2. Choose Start Analysis tool.

2.2 Transaction ST02

The buffer tune summary transaction displays the R/3 buffer performance statistics. It is
used to tune buffer parameters of R/3 and, to a lesser degree, the R/3 database and
operating system.
The buffer is important because significant buffer swapping reduces performance. Look
under Swaps for red entries.
1. In the Command field, enter transaction ST02 and choose Enter. The two important
things to review are:
Hit Ratio, for which the target value is 95 percent and higher
Soon after starting the system, this value is typically low, because certain buffers are
empty. The hit ratio will increase as the system is used and the buffers are loaded. It
usually takes a day to load the buffers that are normally used.
Swaps, for which the target value is less than 1,000
Swaps occur when the necessary data is not in the buffer. The system has to retrieve
the data from the database. The swap value is reset to zero (0) when the system is
restarted.
Buffer swaps may be due to:
Buffer too small, out of space
Out of buffer directory entries
Fragmentation in buffer, particularly the program buffer
If program buffer exceeds 10,000, it should be investigated
Generic table buffer
Sometimes very large buffered objects should be unbuffered (over 2MB)
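
The two targets above (hit ratio of 95 percent or higher, fewer than 1,000 swaps) can be applied mechanically to values noted from the ST02 screen. This is a minimal sketch; the record layout, buffer names, and sample figures are assumptions for illustration.

```python
from typing import Dict, List

HIT_RATIO_TARGET = 95.0  # percent, target stated in the text
SWAP_LIMIT = 1000        # target stated in the text


def buffer_findings(buffers: List[Dict]) -> List[str]:
    """Return one message per buffer value that misses its target."""
    findings = []
    for b in buffers:
        if b["hit_ratio"] < HIT_RATIO_TARGET:
            findings.append(f"{b['name']}: hit ratio {b['hit_ratio']}% is below target")
        if b["swaps"] >= SWAP_LIMIT:
            findings.append(f"{b['name']}: {b['swaps']} swaps since restart")
    return findings


# Hypothetical values transcribed from the buffer tune summary.
buffers = [
    {"name": "Program", "hit_ratio": 99.2, "swaps": 12500},
    {"name": "Generic key", "hit_ratio": 91.0, "swaps": 0},
    {"name": "Nametab", "hit_ratio": 99.8, "swaps": 0},
]

for line in buffer_findings(buffers):
    print(line)
```

Because hit ratios are low shortly after a restart, a check like this is only meaningful once the buffers have had time to load.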

2.3 Workload Analysis of the System, Transaction ST03


Workload analysis is used to determine system performance.
Statistics should be checked, and trends should be recorded to get a feel for the system's
behavior and performance. Understanding the system when it is running well will help
determine what changes need to be made when it is not.
1. In the Command field, enter transaction ST03 and choose Enter (or choose
Tools → Administration, then Monitor → Performance, then Workload → Analysis).
2. To check Response times
A. Choose Detailed analysis menu from toolbar.
B. Choose One recent period from Performance history, Global.
C. Choose a time period.
Today
Previous days
This week
Previous weeks
This month
Previous months
D. Select date.

E. Choose Dialog at the bottom of the screen

F. Choose Transaction profile from the toolbar


Main menu dialog step average time should be very short
If it is not, this could indicate a dialog instance bottleneck
Key dialog transaction step time components
Wait time should be very short
DB time should be much less than CPU time
3. To compare servers
A. Choose Detailed analysis menu from toolbar
B. Choose Compare all servers from Performance history, Global.
This indicates average performance across all instances
If there is a performance problem specific to an instance, Avg wait time and
CPU time will be the best indicators.

2.4 DB Performance Analysis, Transaction ST04


The database error log is the record of database-level errors.
Database error logs may indicate a database problem that is not reported in other
locations.
1. In the Command field, enter transaction ST04.
A. Choose Goto → Database log.
B. In the Database Messages window, select Only alerts.
This selection reduces the amount of text to look through in the first pass.
During a second pass through this transaction, select All messages.
C. Choose Display.
D. Scroll down the log to check for error messages.

2. Check the database statistics


A. Oracle data buffer checks
If there are more than a few million reads, the buffer quality should be >
98%
< 98% could indicate buffer too small or expensive SQL statements
Buffer waits should be < 5% of reads
> 5% could indicate I/O bottleneck
B. Oracle call statistics checks
Reads per user call should be < 40
> 40 could indicate expensive SQL
User calls / recursive calls should be > 5
< 5 could indicate that the Shared Pool may be too small
C. Oracle shared Pool statistics checks
DD cache quality should be > 90%
< 90% could indicate that the Shared Pool is too small
SQL Area getratio, pinratio should be > 90%
< 90% could indicate that the Shared Pool is too small
3. Check database SQL
A. Go to Detailed analysis menu on tool bar, then to Resource consumption by,
SQL request, sort by Buffer Gets
B. Look for statements with:
More than 1,000-10,000 buffer gets / record (most optimization potential)
Buffer gets amounting to more than 5% of the total reads since start
More than a few executions
Larger fraction of reads from disk than other statements
Double click on line to get more information
Once displayed, the Explain option can be selected for further
analysis
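
The Oracle thresholds listed in step 2 can be collected into a single check. The sketch below is illustrative only: the dictionary keys and sample figures are invented, the counters are assumed to be nonzero, and the text's precondition that the data buffer check applies only after a few million reads is not modeled.

```python
from typing import Dict, List


def oracle_findings(s: Dict[str, float]) -> List[str]:
    """Compare ST04-style Oracle statistics with the thresholds in the text."""
    checks = [
        (s["buffer_quality_pct"] < 98,
         "data buffer quality < 98%: buffer too small or expensive SQL"),
        (s["buffer_busy_waits"] > 0.05 * s["reads"],
         "buffer waits > 5% of reads: possible I/O bottleneck"),
        (s["reads"] / s["user_calls"] > 40,
         "reads per user call > 40: possible expensive SQL"),
        (s["user_calls"] / s["recursive_calls"] < 5,
         "user calls / recursive calls < 5: Shared Pool may be too small"),
        (s["dd_cache_quality_pct"] < 90,
         "DD cache quality < 90%: Shared Pool may be too small"),
    ]
    return [message for failed, message in checks if failed]


# Hypothetical figures noted from the ST04 main screen.
stats = {
    "reads": 50_000_000,
    "buffer_busy_waits": 100_000,
    "buffer_quality_pct": 96.5,
    "user_calls": 2_000_000,
    "recursive_calls": 1_000_000,
    "dd_cache_quality_pct": 95.0,
}

for finding in oracle_findings(stats):
    print(finding)
```

Each message repeats the possible cause given in the text, so the output can be pasted into a daily monitoring report as is.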

2.5 Local OS Monitor, Transaction ST06


The local OS monitor receives information from the OS collector.
1. In the Command field, enter transaction ST06 and choose Enter (or choose
Tools → CCMS, then Control/Monitoring → Performance Menu, then Operating
system → Local → Activity).
2. Check the following areas
CPU utilization and system load
Check idle %
Check load average
Load average should be < 1.5
Physical and virtual memory
Swap statistics
Slowest disk
LAN statistics

File System Checks


3.1 Check SAP R/3


This process removes old data from the /usr/sap/<SID>/DXX/work directory.
1. Remove old dev_w# files
Log on to each application server through transaction SM51
Highlight one application server
Select the Processes button
Follow the path Process → Trace → Reset files
2. Remove old dev_rd# files
Log on to each application server through transaction SM51
Highlight one application server
Select the Processes button
Run transaction /nSMGW
Follow the path Goto → Trace → Gateway → Reset File
3. Remove old dev_rfc files
Run transaction SM59
Follow the path RFC → Delete Trace
