/************************************************************************/
/* Document     : Short note on how to trace processes in UNIX.        */
/* Doc. Version : 4                                                    */
/* File         : tracing.txt                                          */
/* Purpose      : Some examples on how to trace processes in UNIX.     */
/*                For the DBA working with databases on UNIX.          */
/* Date         : 14/08/2009                                           */
/* Compiled by  : Albert van der Sel                                   */
/************************************************************************/
============================================================================
1. First some info before you trace:
============================================================================
When you study your trace files, you may come across a number of error messages
or error codes.
The error codes we mean here are the codes that are also visible in the file "errno.h".
This is a header file of the standard C library.
They are a subset of the codes that a program might get when it requests a service
from the system (for example, "open file").
That is certainly not everything you might run into regarding errors and their codes,
but it constitutes an important base of what you can encounter in traces.
Suppose you find something like this in a trace:
vnop_lookup(dvp = F100010034228BF8, flag = 0002) = 0002, *vpp = 0000
return from statx. error ENOENT [13 usec]
What can ENOENT mean? If you don't find some more explanatory text close to this line,
then you can find from the lists below that it means "No such file or directory".
Actually, I produced 2 lists, one from Linux and one from AIX, to show how they compare:
the classic low-numbered codes (1 to 34) are identical in both lists, but many of the
higher numbers differ (for example, EDEADLK is 35 on Linux and 45 on AIX).
So there is no guarantee that the codes are *exactly* the same on all systems.
By the way, if you search for that "errno.h" file (or a similarly named file) on your own system,
and take a look at the contents, you can create the list yourself for your particular unix/linux system.
You can (likely) find that file in "/usr/include/sys".
But for easy reference, we list the most important errnos for 2 representative unixes.
(Yes.. one listing would have been quite sufficient).
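If a scripting language is at hand, the lookup described above does not need the header file at all. The following is a minimal sketch in Python (an assumption: Python is not part of the original note), using the standard "errno" and "os" modules. The helper name describe_errno is made up for this illustration. The message text comes from the C library of the system you run it on, so it may differ slightly between unixes.

```python
import errno
import os


def describe_errno(code):
    """Return (number, symbolic name, message) for an errno name or number."""
    if isinstance(code, str):
        number = getattr(errno, code)      # e.g. "ENOENT" -> 2
    else:
        number = int(code)
    # errorcode maps numbers back to names known on this platform
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)


if __name__ == "__main__":
    # Codes that appear in the trace examples of this note
    for item in ("ENOENT", "EACCES", 25):
        print("%3d  %-8s %s" % describe_errno(item))
```

Run it on the same machine where the trace was taken, because the numbers above 34 are platform specific.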
1.1 Errcodes Linux (generic) from errno.h :
===========================================
#define EPERM             1
#define ENOENT            2
#define ESRCH             3
#define EINTR             4
#define EIO               5
#define ENXIO             6
#define E2BIG             7
#define ENOEXEC           8
#define EBADF             9
#define ECHILD            10
#define EAGAIN            11
#define ENOMEM            12
#define EACCES            13
#define EFAULT            14
#define ENOTBLK           15
#define EBUSY             16
#define EEXIST            17
#define EXDEV             18
#define ENODEV            19
#define ENOTDIR           20
#define EISDIR            21
#define EINVAL            22
#define ENFILE            23
#define EMFILE            24
#define ENOTTY            25
#define ETXTBSY           26
#define EFBIG             27
#define ENOSPC            28
#define ESPIPE            29
#define EROFS             30
#define EMLINK            31
#define EPIPE             32
#define EDOM              33
#define ERANGE            34
#define EDEADLK           35
#define ENAMETOOLONG      36
#define ENOLCK            37
#define ENOSYS            38
#define ENOTEMPTY         39
#define ELOOP             40
#define EWOULDBLOCK       EAGAIN
#define ENOMSG            42
#define EIDRM             43
#define ECHRNG            44
#define EL2NSYNC          45
#define EL3HLT            46
#define EL3RST            47
#define ELNRNG            48
#define EUNATCH           49
#define ENOCSI            50
#define EL2HLT            51
#define EBADE             52
#define EBADR             53
#define EXFULL            54
#define ENOANO            55
#define EBADRQC           56
#define EBADSLT           57
#define EDEADLOCK         EDEADLK
#define EBFONT            59
#define ENOSTR            60
#define ENODATA           61
#define ETIME             62
#define ENOSR             63
#define ENONET            64
#define ENOPKG            65
#define EREMOTE           66
#define ENOLINK           67
#define EADV              68
#define ESRMNT            69
#define ECOMM             70
#define EPROTO            71
#define EMULTIHOP         72
#define EDOTDOT           73
#define EBADMSG           74
#define EOVERFLOW         75
#define ENOTUNIQ          76
#define EBADFD            77
#define EREMCHG           78
#define ELIBACC           79
#define ELIBBAD           80
#define ELIBSCN           81
#define ELIBMAX           82
#define ELIBEXEC          83
#define EILSEQ            84
#define ERESTART          85
#define ESTRPIPE          86
#define EUSERS            87
#define ENOTSOCK          88
#define EDESTADDRREQ      89
#define EMSGSIZE          90
#define EPROTOTYPE        91
#define ENOPROTOOPT       92
#define EPROTONOSUPPORT   93
#define ESOCKTNOSUPPORT   94
#define EOPNOTSUPP        95
#define EPFNOSUPPORT      96
#define EAFNOSUPPORT      97
#define EADDRINUSE        98
#define EADDRNOTAVAIL     99
#define ENETDOWN          100
#define ENETUNREACH       101
#define ENETRESET         102
#define ECONNABORTED      103
#define ECONNRESET        104
#define ENOBUFS           105
#define EISCONN           106
#define ENOTCONN          107
#define ESHUTDOWN         108
#define ETOOMANYREFS      109
#define ETIMEDOUT         110
#define ECONNREFUSED      111
#define EHOSTDOWN         112
#define EHOSTUNREACH      113
#define EALREADY          114
#define EINPROGRESS       115
#define ESTALE            116
#define EUCLEAN           117
#define ENOTNAM           118
#define ENAVAIL           119
#define EISNAM            120
#define EREMOTEIO         121
#define EDQUOT            122
#define ENOMEDIUM         123
#define EMEDIUMTYPE       124
The list above should actually be sufficient, but next we show the corresponding
list for AIX:
1.2 errcodes AIX:
=================
#define EPERM           1    /* Operation not permitted */
#define ENOENT          2    /* No such file or directory */
#define ESRCH           3    /* No such process */
#define EINTR           4    /* interrupted system call */
#define EIO             5    /* I/O error */
#define ENXIO           6    /* No such device or address */
#define E2BIG           7    /* Arg list too long */
#define ENOEXEC         8    /* Exec format error */
#define EBADF           9    /* Bad file descriptor */
#define ECHILD          10   /* No child processes */
#define EAGAIN          11   /* Resource temporarily unavailable */
#define ENOMEM          12   /* Not enough space */
#define EACCES          13   /* Permission denied */
#define EFAULT          14   /* Bad address */
#define ENOTBLK         15   /* Block device required */
#define EBUSY           16   /* Resource busy */
#define EEXIST          17   /* File exists */
#define EXDEV           18   /* Improper link */
#define ENODEV          19   /* No such device */
#define ENOTDIR         20   /* Not a directory */
#define EISDIR          21   /* Is a directory */
#define EINVAL          22   /* Invalid argument */
#define ENFILE          23   /* Too many open files in system */
#define EMFILE          24   /* Too many open files */
#define ENOTTY          25   /* Inappropriate I/O control operation */
#define ETXTBSY         26   /* Text file busy */
#define EFBIG           27   /* File too large */
#define ENOSPC          28   /* No space left on device */
#define ESPIPE          29   /* Invalid seek */
#define EROFS           30   /* Read only file system */
#define EMLINK          31   /* Too many links */
#define EPIPE           32   /* Broken pipe */
#define EDOM            33   /* Domain error within math function */
#define ERANGE          34   /* Result too large */
#define ENOMSG          35   /* No message of desired type */
#define EIDRM           36   /* Identifier removed */
#define ECHRNG          37   /* Channel number out of range */
#define EL2NSYNC        38   /* Level 2 not synchronized */
#define EL3HLT          39   /* Level 3 halted */
#define EL3RST          40   /* Level 3 reset */
#define ELNRNG          41   /* Link number out of range */
#define EUNATCH         42   /* Protocol driver not attached */
#define ENOCSI          43   /* No CSI structure available */
#define EL2HLT          44   /* Level 2 halted */
#define EDEADLK         45   /* Resource deadlock avoided */
#define ENOTREADY       46   /* Device not ready */
#define EWRPROTECT      47   /* Write-protected media */
#define EFORMAT         48   /* Unformatted media */
#define ENOLCK          49   /* No locks available */
#define ENOCONNECT      50   /* no connection */
#define ESTALE          52   /* no filesystem */
#define EDIST           53   /* old, currently unused AIX errno */
#define EINPROGRESS     55   /* Operation now in progress */
#define EALREADY        56   /* Operation already in progress */
#define ENOTSOCK        57   /* Socket operation on non-socket */
#define EDESTADDRREQ    58   /* Destination address required */
#define EDESTADDREQ     EDESTADDRREQ /* Destination address required */
#define EMSGSIZE        59   /* Message too long */
#define EPROTOTYPE      60   /* Protocol wrong type for socket */
#define ENOPROTOOPT     61   /* Protocol not available */
#define EPROTONOSUPPORT 62   /* Protocol not supported */
#define ESOCKTNOSUPPORT 63   /* Socket type not supported */
#define EOPNOTSUPP      64   /* Operation not supported on socket */
#define EPFNOSUPPORT    65   /* Protocol family not supported */
#define EAFNOSUPPORT    66   /* Address family not supported by protocol family */
#define EADDRINUSE      67
#define EADDRNOTAVAIL   68
#define ENETDOWN        69
#define ENETUNREACH     70
#define ENETRESET       71
#define ECONNABORTED    72
#define ECONNRESET      73
#define ENOBUFS         74
#define EISCONN         75
#define ENOTCONN        76
#define ESHUTDOWN       77
#define ETIMEDOUT       78
#define ECONNREFUSED    79
#define EHOSTDOWN       80
#define EHOSTUNREACH    81
#define ERESTART        82
#define EPROCLIM        83
#define EUSERS          84
#define ELOOP           85
#define ENAMETOOLONG    86
#define EDQUOT          88
#define ECORRUPT        89
#define EREMOTE         93
#define ENOSYS          109
#define EMEDIA          110
#define ESOFT           111
#define ENOATTR         112
#define ESAD            113
#define ENOTRUST        114
#define ETOOMANYREFS    115
#define EILSEQ          116
#define ECANCELED       117
#define ENOSR           118
#define ETIME           119  /* I_STR ioctl timed out */
#define EBADMSG         120  /* wrong message type at stream head */
#define EPROTO          121  /* STREAMS protocol error */
#define ENODATA         122  /* no message ready at stream head */
#define ENOSTR          123  /* fd is not a stream */
#define ECLONEME        ERESTART /* this is the way we clone a stream ... */
#define ENOTSUP         124  /* POSIX threads unsupported value */
#define EMULTIHOP       125  /* multihop is not allowed */
#define ENOLINK         126  /* the link has been severed */
#define EOVERFLOW       127  /* value too large to be stored in data type */
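To see where two such listings disagree, you can parse the "#define" lines and compare the numbers. The sketch below is in Python (an assumption, the original note contains no script for this). The function names and the two small sample strings are made up for illustration; the sample values are taken from the two lists above. It only understands plain "#define NAME NUMBER" and "#define NAME OTHERNAME" lines, not conditional compilation.

```python
import re

DEFINE = re.compile(r"^\s*#\s*define\s+(E\w+)\s+(\w+)")


def parse_errno_header(text):
    """Map errno names to numbers from '#define NAME VALUE' lines."""
    raw = {}
    for line in text.splitlines():
        match = DEFINE.match(line)
        if match:
            raw[match.group(1)] = match.group(2)
    table = {}
    for name, value in raw.items():
        # Follow aliases such as '#define EWOULDBLOCK EAGAIN'
        seen = set()
        while not value.isdigit() and value in raw and value not in seen:
            seen.add(value)
            value = raw[value]
        if value.isdigit():
            table[name] = int(value)
    return table


def differing_codes(table_a, table_b):
    """Return {name: (a, b)} for names in both tables whose numbers differ."""
    common = table_a.keys() & table_b.keys()
    return {n: (table_a[n], table_b[n])
            for n in sorted(common) if table_a[n] != table_b[n]}


# Tiny illustrative samples, not complete headers
LINUX_SAMPLE = """
#define ENOENT 2
#define EAGAIN 11
#define EDEADLK 35
#define EWOULDBLOCK EAGAIN
"""
AIX_SAMPLE = """
#define ENOENT 2 /* No such file or directory */
#define EDEADLK 45 /* Resource deadlock avoided */
"""

if __name__ == "__main__":
    linux = parse_errno_header(LINUX_SAMPLE)
    aix = parse_errno_header(AIX_SAMPLE)
    print(differing_codes(linux, aix))
```

With real header files you would read the file contents into the two strings instead of using the samples.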
============================================================================
2. A quick one: The "truss" tool on many unixes:
============================================================================
Here is a quick one to trace a shell script, or executable program: using "truss".
The "truss" tool is available on many unix platforms. It has many options, but a very useful command
to trace the system calls that a script or program makes is:
$ truss -o /tmp/myprg.log myprg
In this example, truss will log to the file "/tmp/myprg.log" while it traces the program "myprg".
Of course, you can choose another path and logfile to trace to.
The command above works well for tracing a shell script, or program, that starts up, does some work,
and then terminates. If an error occurs during runtime, it's likely that you find some pointers
in the logfile that truss made for you.
The tool has many more options; for example, you can focus your trace on a certain library.
Anyway, even the simple truss example above can already be very helpful.
So, for example, if you find in the log that truss has produced, the error "EACCES", which is
"errno 13 = Permission denied", that would really be helpful. Obviously, your shell script or
program tries to access a certain object, to which it has insufficient permissions,
and thus may fail.
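A truss logfile can be long, so it may help to let a small script pull out the error lines first. The following Python sketch is an illustration only (Python and the function name count_truss_errors are assumptions, not part of the original note). It assumes that failed calls are marked as "Err#<number> <NAME>", which is the form seen in the AIX truss output in section 4; other truss versions may print errors differently.

```python
import re
from collections import Counter

# Matches e.g. "Err#25 ENOTTY"
ERR = re.compile(r"Err#(\d+)\s+(E\w+)")


def count_truss_errors(lines):
    """Count occurrences of each (errno number, name) in truss output lines."""
    counts = Counter()
    for line in lines:
        match = ERR.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


# Sample lines modelled on the truss output shown in section 4
SAMPLE = [
    'open("/usr/lib/nls/msg/en_US/cmdps.cat", O_RDONLY) = 3',
    "kioctl(3, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY",
    "kfcntl(3, F_SETFD, 0x00000001) = 0",
    "kioctl(3, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY",
]

if __name__ == "__main__":
    for (number, name), hits in count_truss_errors(SAMPLE).most_common():
        print(number, name, hits)
```

For a real log, pass an open file instead of the sample list, for example the "/tmp/myprg.log" file from the command above.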
Be warned though, that some errnos might be found multiple times, while it's actually not
something to worry about. For example "ENOENT = No such file or directory" might be found
quite often. Here, your script or program seems to be unable to find a file or directory.
Well, if it's related to the $PATH environment variable, it could be quite reasonable.
Your shell will search your $PATH from beginning to end, until the object has been found.
Thus, it's quite possible that some ENOENT errors occur along the way.
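The $PATH search described above can be imitated in a few lines, which shows why several harmless "misses" show up in a trace before the hit. This is a simplified Python sketch (Python and the name path_candidates are assumptions for illustration); a real shell also caches locations and handles builtins, which is ignored here.

```python
import os


def path_candidates(command, path):
    """List (candidate file, found?) in the order a PATH search tries them."""
    results = []
    for directory in path.split(os.pathsep):
        # An empty PATH entry means the current directory
        candidate = os.path.join(directory or ".", command)
        found = os.path.isfile(candidate) and os.access(candidate, os.X_OK)
        results.append((candidate, found))
        if found:
            break    # the search stops at the first hit
    return results


if __name__ == "__main__":
    for candidate, found in path_candidates("ls", os.environ.get("PATH", "")):
        print("found" if found else "miss ", candidate)
```

Every "miss" line corresponds to a lookup that a tracer would typically report as ENOENT.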
In section 4.2 you can find some more info on truss.
============================================================================
3. Tracing in Linux:
============================================================================
3.1.strace:
===========
>>> strace example on Linux:
One main trace utility on most Linux distros is the "strace" command.
You can use it with many parameters, but "-o outputfile" is very important,
in order to save the output to a file.
Use it like:
# strace -o logfile <name_of_command_or_program_you_want_to_trace>
# strace -o logfile -p <process_id>      # to trace a process that is already running
Because strace will show you the system calls and signals, you can use it to reveal whether a program cannot
find a file, or does not have permissions to read (or write to) a file. In such a case, a program might fail.
Example 1:
----------
Suppose we have a file called "/etc/security.conf". Now we run a utility to read the file (like cat, pg, more, less etc..)
as a normal user who does not have permission to read the file. Let's trace that event to a logfile, and see
what we can discover.
$ strace -o strace_example.log less /etc/security.conf
A trace file can get pretty long, but you should just browse it and be alert to anything that looks like a reported error.
So, if we take a look in the logfile "strace_example.log":
..
..
open("/etc/security.conf", O_RDONLY|O_LARGEFILE) = -1 EACCES (Permission denied)
write(2, "/etc/security.conf: Permission denied\n", 32) = 32
..
..
We can clearly see that our program failed due to lack of permission.
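Instead of browsing the whole logfile, you can filter it for calls that returned -1. The sketch below is in Python (an assumption, as is the function name failed_calls). It expects the plain line layout shown in the example above; if you run strace with options that add a PID or timestamp in front of each line, the pattern would need to be adapted.

```python
import re

# syscall(args) = -1 ENAME (message)
FAILED = re.compile(r"^(\w+)\((.*)\)\s*= -1 (E[A-Z0-9]+) \((.*)\)\s*$")


def failed_calls(lines):
    """Yield (syscall, arguments, errno name, message) for calls that returned -1."""
    for line in lines:
        match = FAILED.match(line.strip())
        if match:
            yield match.groups()


# The two lines from the example above
SAMPLE = [
    'open("/etc/security.conf", O_RDONLY|O_LARGEFILE) = -1 EACCES (Permission denied)',
    'write(2, "/etc/security.conf: Permission denied\\n", 32) = 32',
]

if __name__ == "__main__":
    for syscall, args, name, message in failed_calls(SAMPLE):
        print(syscall, name, message, args)
```

Remember that not every failed call is a problem; lookups that end in ENOENT are often part of a normal search.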
Example 2:
----------
You can use strace in many ways. One other famous "error" you might find using strace, is that a program needs a library,
but can't find it.
Like in this example:
..
open("/opt/tux/cbl/lib/libdcpybk.so", O_RDONLY) = -1 ENOENT (No such file or directory)
..
Remark:
To find out what libraries a program needs, you might also try the ldd command.
For example, what uuencode needs is shown with:
$ ldd uuencode
uuencode needs:
/usr/lib/libc.a(shr.o)
/unix
/usr/lib/libcrypt.a(shr.o)
3.2. ltrace:
============
While "strace" deals with system calls, if you want to track what library calls an application makes,
you can use the "ltrace" command.
It works much like "strace".
Example:
$ ltrace -o ls_example_trace_file.trc ls
# cat /proc/meminfo
-- cpu info:
# cat /proc/cpuinfo
-- user and process limits:
Sometimes, when a process runs under some account, and it fails for no obvious reason, it might be
worth checking the "ulimit" of that account (like max filesize, max open files, number of files etc..).
Use it under that account as:
# ulimit (-a)
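The same limits can also be read from inside a program, which is handy when the failing process is a script. This is a small Python sketch (Python is an assumption, the labels and function names are made up here) using the standard "resource" module, which is available on unix-like systems only. It shows the limits of the process that runs the script, so start it under the account you want to check.

```python
import resource

# A few of the limits that "ulimit -a" reports
LIMITS = {
    "max file size (bytes)": resource.RLIMIT_FSIZE,
    "max open files": resource.RLIMIT_NOFILE,
    "max stack size (bytes)": resource.RLIMIT_STACK,
    "max core file size (bytes)": resource.RLIMIT_CORE,
}


def show(value):
    """Render a limit value the way ulimit does for 'no limit'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the limits of the current process."""
    return {label: resource.getrlimit(res) for label, res in LIMITS.items()}


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print("%-28s soft=%s hard=%s" % (label, show(soft), show(hard)))
```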
-- Show processtree of parent and children:
# pstree pid
# m in MB; k in KB
If there are many filesystems, you might want to see just the top 5 that are the lowest on free space
(on Linux, "df -kP" prints the use percentage in column 5 and the mount point in column 6):
# df -kP | awk 'NR>1 {print $5,$6}' | sort -n | tail -5
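If you prefer not to depend on the column layout of df, which differs between unixes, a script can compute the same ranking. The following Python sketch is only an illustration (Python, the function name fullest and the example mount points are assumptions). It uses shutil.disk_usage, so the percentage can deviate a little from what df prints, because df leaves reserved blocks out of its calculation.

```python
import shutil


def fullest(mount_points, top=5):
    """Return up to `top` (percent used, mount point) pairs, fullest first."""
    usage = []
    for mount in mount_points:
        total, used, _free = shutil.disk_usage(mount)
        if total:                      # skip pseudo filesystems of size 0
            usage.append((100.0 * used / total, mount))
    return sorted(usage, reverse=True)[:top]


if __name__ == "__main__":
    # Hypothetical list; take the real one from the "Mounted on" column of df
    for percent, mount in fullest(["/", "/tmp", "/var"]):
        print("%5.1f%%  %s" % (percent, mount))
```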
-- How to become another user, or possibly root:
# su - accountname
# su -
============================================================================
4. Tracing in AIX:
============================================================================
In AIX, tracing commands are available like "truss", "syscalls" and "trace".
First we will talk about the "trace" facility, for which AIX also offers a user-friendly
interface. It's a menu-based system (via smitty). But you can use "trace" on the command line as well.
The neat thing here is that you can trace a PID, a program, or just everything.
We will start with the command "smitty trace". We will instruct the system to create
a raw tracefile first (not easily readable), and then, after we have stopped tracing, we create
an ascii (readable) file from the raw file.
START Trace
STOP Trace
Generate a Trace Report
Manage Trace
Manage Event Groups
First we choose "START Trace"
The following menu appears:
FIG. 1.
                                               [Entry Fields]
  EVENT GROUPS to trace                        []
  ADDITIONAL event IDs to trace                []
  Event Groups to EXCLUDE from trace           []
  Event IDs to EXCLUDE from trace              []
->Process IDs to Trace                         []
  Program to Trace                             []
  Propagate Tracing to                         [new processes and threads]
  Trace MODE                                   [alternate]
  STOP when log file full?                     [no]
  LOG FILE                                     [/var/adm/ras/trcfile]
  SAVE PREVIOUS log file?                      [no]
  Omit PS/NM/LOCK HEADER to log file?          [yes]
  Omit DATE-SYSTEM HEADER to log file?         [no]
  Run in INTERACTIVE mode?                     [no]
  Trace BUFFER SIZE in bytes                   [262144]
  LOG FILE SIZE in bytes                       [2621440]
  Buffer Allocation                            [automatic]

FIG. 2. (the same menu, with the log file entries changed)
                                               [Entry Fields]
  EVENT GROUPS to trace                        []
  ADDITIONAL event IDs to trace                []
  Event Groups to EXCLUDE from trace           []
  Event IDs to EXCLUDE from trace              []
  Process IDs to Trace                         []
  Program to Trace                             []
  Propagate Tracing to                         [new processes and threads]
  Trace MODE                                   [alternate]
  STOP when log file full?                     [yes]
  LOG FILE                                     [/tmp/trcraw]
  SAVE PREVIOUS log file?                      [no]
  Omit PS/NM/LOCK HEADER to log file?          [yes]
  Omit DATE-SYSTEM HEADER to log file?         [no]
  Run in INTERACTIVE mode?                     [no]
  Trace BUFFER SIZE in bytes                   [262144]
  LOG FILE SIZE in bytes                       [104857600]     # (changed to 100MB)
  Buffer Allocation                            [automatic]
Next, move to
- "STOP when log file full?"
Decide whether you want to stop logging when the size limit has been reached (generally a good idea).
You can choose between "yes" and "no" via the F4 key.
Next, we move to
- "EVENT GROUPS to trace":
When you have your cursor at this item, press F4. An impressive list of event groups appears,
from which you can select the ones you want to trace.
[Screen: smitty pop-up list of event groups; most entries end in "(reserved)" and the list continues with "[MORE...36]".
 Keys shown: F1=Help, F2=Refresh, F3=Cancel, F4=List, F5=Reset, F7=Select, F8=Image, F10=Exit, Enter=Do, /=Find, n=Find Next]
Now the trace will start and you should see the file "/tmp/trcraw" grow in size.
You can see that with:
$ ls -al /tmp/trcraw
Also, try this command from the prompt:
$ ps -ef | grep trace
and you should see your trace running in the process list.
IMPORTANT:
Did you note that we did not select a PID (process ID) to trace? So, actually, we trace (almost) all processes
"which do something" in the event groups we selected.
Of course, if you know a PID which you want to trace, you just fill that in in the menu shown in Fig. 2.
If you select to trace a PID (only), then your tracefile will of course not grow as fast as it would in our example.
But even in our example (where we trace all processes on the selected event groups), we can see marvelous things.
Suppose Oracle and/or Websphere, or monitoring tools (or you name it), are running. Later on, when you inspect the tracefile,
you can find very valuable information about what those processes do "under the hood".
Remember, we are creating a raw trace file here. We still need to do one extra step, after stopping the trace.
>>> Stop the trace and create a readable file:
----------------------------------------------
Ok, if you have left smitty, start it up again.
$ smitty trace
In the menu that follows, just select "STOP Trace".
START Trace
STOP Trace
Generate a Trace Report
Manage Trace
Manage Event Groups
and the trace facility will stop tracing.
Next, we want to have a readable file, which we can view (use cat, pg, more, grep etc..).
In smitty, there are options available to create a trace report, but I think it's more instructive
to do this from the prompt. Here we go:
We have a raw trace in the file /tmp/trcraw
Let's create a readable file from the raw file, and call it "/tmp/trctxt".
You can do that with for example:
$ trcrpt -O pid=on,exec=on /tmp/trcraw > /tmp/trctxt
Please be aware that the textfile is typically 2 or 3 times larger than the raw file. So, always check the available
space in the filesystem where you want to create the file.
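That space check is simple arithmetic: take the size of the raw file, multiply by the expected growth factor, and compare with the free space of the target filesystem. A minimal Python sketch follows (Python and the function name room_for_report are assumptions for illustration; the factor 3 is the upper end of the "2 or 3 times" rule of thumb above, not a guarantee).

```python
import os
import shutil


def room_for_report(raw_file, target_dir, factor=3):
    """Return (enough?, bytes needed, bytes free) for a report `factor` times the raw size."""
    needed = os.path.getsize(raw_file) * factor
    free = shutil.disk_usage(target_dir).free
    return free >= needed, needed, free


if __name__ == "__main__":
    raw = "/tmp/trcraw"    # the raw trace file used in this section
    if os.path.exists(raw):
        ok, needed, free = room_for_report(raw, "/tmp")
        print("needed=%d free=%d -> %s" % (needed, free, "ok" if ok else "NOT enough space"))
    else:
        print("no raw trace file found at", raw)
```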
Now you can open the file, or grep it on an identifier etc..
104 ksh  1175648   4.074383326   0.001349   return from __loadx [1 usec]
101 ksh  1175648   4.074383929   0.000603   __loadx LR = D03B8288
104 ksh  1175648   4.074385155   0.001226   return from __loadx [1 usec]
101 ksh  1175648   4.074395702   0.010547   getuidx LR = D03BF94C
104 ksh  1175648   4.074396214   0.000512   return from getuidx [1 usec]
..
                   4.075516073   0.009374   fstatx LR ..
                   4.075520760   0.004687   return from ..
>>> statx, stat, lstat, fstatx, fstat, fullstat, ffullstat, stat64, lstat64, fstat64, stat64x, fstat64x, or lstat64x Subroutine
Purpose
Provides information about a file or shared memory object.
Library
Standard C Library (libc.a)
>>> vnop_open Entry Point
Purpose
Requests that a file be opened for reading or writing.
0 Indicates success.
Nonzero return values are returned from the /usr/include/sys/errno.h file to indicate failure.
>>> getuid, geteuid, or getuidx Subroutine
Purpose
Gets the real or effective user ID of the current process.
Library
Standard C Library (libc.a)
n denied".
So, suppose that you go to "/opt/app/etc/" and check the permissions on the file "cc.conf"; you would find
that the permissions on that file should be altered.
After using the following command:
$ chmod g+r cc.conf       # here we give the group read permission on "cc.conf"
the program runs without errors. Probably this was a program that first wanted to read configuration information
from "/opt/app/etc/cc.conf", and if that fails, the program would just terminate without any message.
Of course, that program could have been designed much better.
But we have seen an example where truss was of use.
Example 2:
----------
Let's run the program "lsps -s" (show paging space) from my home dir, and let's truss it, to see what system calls it makes:
albert@sharky:/home/albert $ truss lsps -s
execve("/usr/sbin/lsps", 0x2FF22A4C, 0x2000EB28) argc: 2
__loadx(0x03000000, 0x2FF22870, 0x000000F0, 0x10000000, 0x20000E14) = 0x00000000
__loadx(0x0A040000, 0xD0572CD4, 0x0000000A, 0x00000000, 0x00000000) = 0x00000000
sbrk(0x00000000)                                = 0x20004570
vmgetinfo(0x2FF21C30, 7, 16)                    = 0
sbrk(0x00000000)                                = 0x20004570
__libc_sbrk(0x00000000)                         = 0x20004570
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
__loadx(0x0A040000, 0xD0572CA0, 0x2FF22FFC, 0x0000D0B2, 0x00000000) = 0x00000000
__loadx(0x01000180, 0x2FF216E0, 0x00000960, 0xF028CC4C, 0xF028CB7C) = 0xF03358D8
__loadx(0x0A040000, 0xD0572CA0, 0x2FF22FFC, 0x0000D0B2, 0x00000000) = 0x00000000
__loadx(0x07080000, 0xF028CC1C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336808
__loadx(0x07080000, 0xF028CB5C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336814
__loadx(0x07080000, 0xF028CC2C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336844
__loadx(0x07080000, 0xF028CB6C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336850
__loadx(0x07080000, 0xF028CBEC, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336820
__loadx(0x07080000, 0xF028CB8C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336838
__loadx(0x07080000, 0xF028CBFC, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF033685C
__loadx(0x07080000, 0xF028CC0C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF033688C
__loadx(0x07080000, 0xF028CB9C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336874
__loadx(0x07080000, 0xF028CBAC, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336910
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
access("/usr/lib/nls/msg/en_US/cmdps.cat", 0)   = 0
_getpid()                                       = 483490
psdanger(0)                                     = 524288
psdanger(-1)                                    = 521468
open("/usr/lib/nls/msg/en_US/cmdps.cat", O_RDONLY) = 3
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kfcntl(3, F_SETFD, 0x00000001)                  = 0
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kread(3, "\0\001 \001\001 I S O 8".., 4096)     = 4096
lseek(3, 0, 1)                                  = 4096
lseek(3, 0, 1)                                  = 4096
lseek(3, 0, 1)                                  = 4096
_getpid()                                       = 483490
lseek(3, 0, 1)                                  = 4096
kioctl(1, 22528, 0x00000000, 0x00000000)        = 0
Total Paging Space Percent Used
kwrite(1, " T o t a l P a g i n g".., 34)       = 34
2048MB 1%
kwrite(1, " 2 0 4 8 M B".., 30)                 = 30
__loadx(0x04000000, 0x2FF22080, 0x00000800, 0x0000D0B2, 0x00000000) = 0x00000000
kfcntl(1, F_GETFL, 0x00000001)                  = 67110914
kfcntl(2, F_GETFL, 0xF02DF418)                  = 67110914
_exit(0)
There is a lot of output on the screen. I entered "lsps -s", and truss watches what syscalls are made
and shows them on your screen.
In fact, many of the first lines deal with "getuidx" and that kind of call. The system would like to know
who issued the command (and in what groups he/she is).
You can ignore the output, because it's not that interesting. I only "published" it here, to give you an
idea of how much output those tracing commands (like truss) generate.
If I just want to store that information in a logfile, for example "truss.log",
I would use the following command:
albert@sharky:/home/albert $ truss -o truss.log lsps -s
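Once the output is in a logfile, a short script can summarise which calls dominate it, so you don't have to scroll through all the repeated lines. The following Python sketch is an illustration (Python and the name syscall_histogram are assumptions). It assumes that each traced call starts a line as "name(", as in the truss output above, and simply skips lines that don't, such as the program's own output.

```python
import re
from collections import Counter

# A call line starts with the syscall name followed by "("
CALL = re.compile(r"^\s*(\w+)\(")


def syscall_histogram(lines):
    """Count how often each system call name appears in truss output lines."""
    counts = Counter()
    for line in lines:
        match = CALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the "truss lsps -s" output above
SAMPLE = [
    "getuidx(4) = 6318",
    "getuidx(2) = 6318",
    "getgidx(4) = 1105",
    "__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130",
    "Total Paging Space Percent Used",
    "_exit(0)",
]

if __name__ == "__main__":
    for name, hits in syscall_histogram(SAMPLE).most_common():
        print("%-10s %d" % (name, hits))
```

For the real log, pass the open "truss.log" file instead of the sample list.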
# with SP, TL
-- Show the jobs that are scheduled (in the account you use) from cron:
# crontab -l
-- What are the standard mounted filesystems?: That's defined in "/etc/filesystems"
# cat /etc/filesystems | more
-- Which processes are using a certain filesystem?
# fuser -c /filesystem
acle
# bootinfo -r
# lsattr -E -l mem0
# lsattr -E -l sys0 -a realmem
# svmon -G
# vmstat -v
# vmo -L                   # ( lots of output )
# svmon -U -g -t 10        # ( top 10 users paging space)
-- Swap usage:
# lsps -s
than 75% used? Oh boy!
# pstat -s
-- cpu info:
# lparstat (-i)
# prtconf | grep proc
# pmcycles -m
# lscfg | grep proc
# pstat -S
-- ulimit:
Sometimes, when a process runs under someone's credentials, and it fails for no obvious reason, it might be
worth checking the "ulimit" of that account (like max filesize, max open files, number of files etc..).
Use it under that account as:
# ulimit -a
-- Show process tree of parent and children:
# proctree pid
# m in MB; k in KB; g in GB
If there are many filesystems, you might want to see just the top 5 that are the lowest on free space:
# df -k |awk '{print $4,$7}' |grep -v "Filesystem" | sort -n | tail -5
-- How to become another user, or possibly root:
# su - accountname
# su -
# kill -9 PID        # careful, don't kill the wrong one; not recommended unless you don't have a choice.
-- Careful!! How to kill all your processes "the hard way", all at once?
# kill -9 -1         # not recommended unless you don't have a choice.
# killall            # not recommended unless you don't have a choice.
-- To clear out unused system modules (currently unused modules in kernel and library memory):
# slibclean
============================================================================
5. Solaris:
============================================================================
A similar "story" will be put here, but then of course for Solaris.
============================================================================
6. Other:
============================================================================
6.1 Some trivial remarks:
=========================
Now for some really really trivial remarks......
(Sorry !)
- Kernel parameters
If you have problems installing a program, or if it fails to run properly, are you sure all
required kernel parameters have been set?
- Environment variables
If you have problems installing a program, or if it fails to run properly, are you sure all
required environment variables have been set?
Many "large" programs really have an impressive list of variables you need to set
before they will run properly.
- Dependencies on other stuff.
Most (commercial) programs depend heavily on installed support programs or tools, like perl, java, etc..
They may even have very strict requirements on versions of those support programs.
- Cluttered memory (ipc identifiers, semaphores, shared memory)
If you have started an application, and terminated it roughly, it's possible that
"stuff" still remains in memory.
In such a case, it's possible that your app will not be able to restart.
You need to use a tool like "ipcrm" to clean memory, or
you might even consider rebooting the system.
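The remark about environment variables above is easy to turn into a quick check before you start tracing anything. Below is a minimal Python sketch (Python, the function name missing_variables and the variable names in the example are assumptions; take the real list from the installation guide of your program). It reports the variables that are unset or empty in the current environment.

```python
import os


def missing_variables(required, environ=None):
    """Return the names from `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    # Hypothetical example list for a database installation
    wanted = ["ORACLE_HOME", "ORACLE_SID", "PATH"]
    missing = missing_variables(wanted)
    if missing:
        print("not set:", ", ".join(missing))
    else:
        print("all required variables are set")
```

Run it under the same account, and from the same kind of shell, as the program that fails; cron jobs in particular often start with a much smaller environment.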