
Agenda

Overview of Solaris built-in monitoring tools

Monitoring CPU problems

Memory Problems

Networking

New tools in Solaris 11

1/18/15
Where most folks start...CPU
Key Observables
> Utilization: usr/sys/idle
> CPU Wait-time: How long is something waiting for the CPU
> Who/What is using the CPU
What to monitor
> Overall Utilization: vmstat
> Load Average: prstat, uptime
> Microstate Accounting: Latency wait state in prstat
> Per-CPU Utilization: mpstat
> CPU by Process: prstat
> DTrace Analysis
Thread Microstates (prstat -m)
Fine-grained state tracking for threads
> Off by default on Solaris 8 and 9
> On by default in Solaris 10
Reported microstates
> USR running in user mode
> SYS running in kernel mode
> TRP trap handling
> TFL text page faults
> DFL data page faults
> LCK user lock time
> SLP sleep
> LAT Runnable, waiting for a CPU
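As a rough illustration of how these columns can be read, the Python sketch below parses one prstat -m style row and reports the dominant microstate. The sample row is made up for illustration, not captured from a real system. The column layout (PID, USERNAME, then the eight microstates in the order listed above) and the 10% LAT threshold are assumptions to check against your own prstat output.

```python
# Microstate columns in the order listed above.
MICROSTATES = ["USR", "SYS", "TRP", "TFL", "DFL", "LCK", "SLP", "LAT"]


def parse_microstates(row):
    """Return {state: percent} for one prstat -m data row.

    Assumes the row starts with PID and USERNAME, followed by the
    eight microstate percentages.
    """
    fields = row.split()
    values = fields[2:2 + len(MICROSTATES)]
    return {state: float(value) for state, value in zip(MICROSTATES, values)}


def cpu_starved(states, lat_threshold=10.0):
    """Hypothetical rule of thumb: flag a thread that spends more than
    lat_threshold percent of its time runnable but waiting for a CPU."""
    return states["LAT"] > lat_threshold


# Made-up sample row, for illustration only.
sample_row = "1234 oracle 55 12 0.1 0.0 0.0 2.9 5.0 25 310 120 4K 0 myapp/7"

states = parse_microstates(sample_row)
dominant = max(states, key=states.get)
print(dominant, states["LAT"], cpu_starved(states))
```

A high LAT value points at CPU saturation (threads are ready to run but cannot get a CPU), whereas a high SLP or LCK value points away from the CPU toward I/O or lock contention.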
Disk I/O
Key Observables
> Disk response time
> Disk Utilization/Saturation
> Channel Utilization/Saturation
> Attribution: which file, process etc...

What to Measure
> Response times: iostat
> Utilization/Saturation: iostat
> Response times, processes, files etc: DTrace
Memory
Memory summary with mdb
SPARC process address space
Executable text -- The executable instructions in the binary reside in
the text segment. The text segment is mapped from the on-disk binary
and is mapped read-only, with execute permissions.
Executable data -- The initialized variables in the executable reside
in the data segment. The data segment is mapped from the on-disk
binary and is mapped read/write/private. The private mapping ensures
that changes made to memory within this mapping are not reflected out
to the file or to other processes mapping the same executable.
Heap space -- Scratch, or memory allocated by malloc(), is allocated
from anonymous memory and is mapped read/write.
Process stack -- The stack is allocated from anonymous memory and
is mapped read/write.
Network
Key Observables

> Link Utilization
> Transmission, framing, checksum errors
> Upstream software congestion
> Routing
> Over the wire latency
NICSTAT
New nicstat version:
https://blogs.oracle.com/timc/entry/nicstat_the_solaris_and_linux
$ nicstat 5
    Time      Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs %Util    Sat
17:05:17      lo0    0.00    0.00    0.00    0.00    0.00    0.00  0.00   0.00
17:05:17  e1000g0    0.61    4.07    4.95    6.63   126.2   628.0  0.04   0.00
17:05:17  e1000g1   225.7   176.2   905.0   922.5   255.4   195.6  0.33   0.00
    Time      Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs %Util    Sat
17:05:22      lo0    0.00    0.00    0.00    0.00    0.00    0.00  0.00   0.00
17:05:22  e1000g0    0.06    0.15    1.00    0.80   64.00   186.0  0.00   0.00
17:05:22  e1000g1    0.00    0.00    0.00    0.00    0.00    0.00  0.00   0.00
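To show how the nicstat columns map onto the key observables, here is a minimal Python sketch that parses the first interval of the output above and picks out the busiest interface. The helper name parse_nicstat is hypothetical, and the sketch assumes whitespace-separated columns exactly as shown.

```python
# First interval of the nicstat output shown above.
NICSTAT_SAMPLE = """\
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
17:05:17 lo0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
17:05:17 e1000g0 0.61 4.07 4.95 6.63 126.2 628.0 0.04 0.00
17:05:17 e1000g1 225.7 176.2 905.0 922.5 255.4 195.6 0.33 0.00
"""


def parse_nicstat(text):
    """Turn nicstat text into a list of dicts keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if fields == header:  # nicstat repeats the header every interval
            continue
        row = dict(zip(header, fields))
        for key in header[2:]:  # everything after Time and Int is numeric
            row[key] = float(row[key])
        rows.append(row)
    return rows


rows = parse_nicstat(NICSTAT_SAMPLE)

# Link utilization: the interface with the highest %Util.
busiest = max(rows, key=lambda r: r["%Util"])

# Saturation: any interface reporting a non-zero Sat value.
saturated = [r["Int"] for r in rows if r["Sat"] > 0]

# Average read packet size in bytes, recomputed from rKB/s and rPk/s.
avg_read_bytes = round(busiest["rKB/s"] * 1024 / busiest["rPk/s"], 1)

print(busiest["Int"], busiest["%Util"], saturated, avg_read_bytes)
```

The recomputed average packet size agrees with the rAvs column for e1000g1, which is a quick way to confirm that the throughput and packet-rate columns are being read correctly.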
Solaris 11 monitoring tools
Querying statistics examples

Zonestat
The zonestat utility reports on the CPU, memory, and resource
control utilization of the currently running zones. Each zone's
utilization is reported as a percentage of both system resources
and the zone's configured limits.

$ zonestat 5 2
Collecting data for first interval...
Interval: 1, Duration: 0:00:05
SUMMARY  Cpus/Online: 32/32  Physical: 31.8G  Virtual: 47.8G
         ----------CPU---------- ----PHYSICAL----- -----VIRTUAL-----
    ZONE  USED %PART  %CAP %SHRU  USED   PCT  %CAP  USED   PCT  %CAP
 [total]  0.10 0.31%     -     - 3109M 9.52%     - 7379M 15.0%     -
[system]  0.01 0.04%     -     - 2797M 8.57%     - 7115M 14.5%     -
  global  0.08 0.51%     -     -  141M 0.43%     -  129M 0.26%     -
   zoneA  0.00 0.02%     -     - 43.7M 0.13%     - 35.4M 0.07%     -
   zoneB  0.00 0.02%     -     - 42.0M 0.12%     - 32.8M 0.06%     -
   zoneC  0.00 0.04%     -     - 42.0M 0.12%     - 32.8M 0.06%     -
   zoneD  0.00 0.02%     -     - 42.1M 0.12%     - 33.2M 0.06%     -
Interval: 2, Duration: 0:00:10
SUMMARY  Cpus/Online: 32/32  Physical: 31.8G  Virtual: 47.8G
         ----------CPU---------- ----PHYSICAL----- -----VIRTUAL-----
    ZONE  USED %PART  %CAP %SHRU  USED   PCT  %CAP  USED   PCT  %CAP
 [total]  0.09 0.30%     -     - 3109M 9.52%     - 7379M 15.0%     -
[system]  0.01 0.03%     -     - 2797M 8.57%     - 7115M 14.5%     -
  global  0.08 0.51%     -     -  142M 0.43%     -  129M 0.26%     -
   zoneA  0.00 0.02%     -     - 43.7M 0.13%     - 35.4M 0.07%     -
   zoneB  0.00 0.02%     -     - 42.0M 0.12%     - 32.8M 0.06%     -
   zoneC  0.00 0.02%     -     - 42.0M 0.12%     - 32.8M 0.06%     -
   zoneD  0.00 0.02%     -     - 42.1M 0.12%     - 33.2M 0.06%     -
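A report like this can be post-processed to rank zones by resource use. The Python sketch below does so for the first interval above, as an illustration only: parse_zonestat and to_mb are hypothetical helper names, and the K/M/G size suffixes are assumed to be binary multiples.

```python
# Data rows from interval 1 of the zonestat output shown above.
ZONESTAT_SAMPLE = """\
[total] 0.10 0.31% - - 3109M 9.52% - 7379M 15.0% -
[system] 0.01 0.04% - - 2797M 8.57% - 7115M 14.5% -
global 0.08 0.51% - - 141M 0.43% - 129M 0.26% -
zoneA 0.00 0.02% - - 43.7M 0.13% - 35.4M 0.07% -
zoneB 0.00 0.02% - - 42.0M 0.12% - 32.8M 0.06% -
zoneC 0.00 0.04% - - 42.0M 0.12% - 32.8M 0.06% -
zoneD 0.00 0.02% - - 42.1M 0.12% - 33.2M 0.06% -
"""

# Assumed meaning of the size suffixes, expressed in megabytes.
UNITS_IN_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def to_mb(size):
    """Convert a value such as '43.7M' or '31.8G' to megabytes."""
    return float(size[:-1]) * UNITS_IN_MB[size[-1]]


def parse_zonestat(text):
    """Map zone name -> CPU used, physical MB, and virtual MB."""
    zones = {}
    for line in text.strip().splitlines():
        f = line.split()
        zones[f[0]] = {
            "cpu_used": float(f[1]),   # CPU USED column
            "phys_mb": to_mb(f[5]),    # PHYSICAL USED column
            "virt_mb": to_mb(f[8]),    # VIRTUAL USED column
        }
    return zones


zones = parse_zonestat(ZONESTAT_SAMPLE)

# Drop the [total] and [system] pseudo-rows before ranking real zones.
real_zones = {name: z for name, z in zones.items() if not name.startswith("[")}
ranked = sorted(real_zones, key=lambda n: real_zones[n]["phys_mb"], reverse=True)

for name in ranked:
    print(name, real_zones[name]["phys_mb"])
```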

Latencytop
Latencytop is an observability tool that reports statistics about latencies in the
system and in applications.
The following command launches the tool with default values for options.
% latencytop
The following command sets the sampling interval to two seconds.
% latencytop -t 2
The following command displays trace data for the process with PID 630.
% latencytop -s pid=630
