Problem Solving
Welcome to Sun Microsystems Software Support for Application Server, Web Server and Java Problems
To aid you in problem solving, this document contains the following:
Tips for Troubleshooting - This section provides tips to help you diagnose problems you may be having with Software products before you open a support incident with Software Support.
Checklist for Opening New Support Cases - The checklist helps you gather the relevant information to include when you open a new support incident with Software Support.
Sample Support Case Entries - You can use these samples as templates for logging support cases with Sun Support.
Please review this document prior to opening new issues with Software Support.
For a list of telephone numbers to Support, see http://www.sun.com/contact/support.jsp
To log support requests over the web, see https://osc-amer.sun.com/OSCSW/svcportal?pageName=clselection and select the country you are in to proceed.
Page 1 Of 23
Table of Contents
Introduction
Tips for Troubleshooting
Checklist for Opening New Support Cases
Sample Support Case Entries
Recommended Data Collection for Crashes and Hangs in the Application Server ( Solaris )
Recommended Data Collection for Crashes and Hangs in the Application Server ( Windows )
Recommended Data Collection for Crashes and Hangs in Application Server ( Linux )
Recommended Data Collection for HADB issue
Recommended Data Collection for Crashes and Hangs in Web Server
Java Self-help
Recommended Data Collection for Crashes and Hangs in Java (Solaris)
Recommended Data Collection for Crashes and Hangs in Java (Linux)
Recommended Data Collection for Crashes and Hangs in Java (Windows)
Introduction
This document aims to assist Sun customers with gathering the information and data required to log a support case with Sun Support for the SunOne Application Server, Sun Web Server and Java. The more relevant information and data you gather before or at the time of logging a support case, the quicker we should be able to help resolve your issues.
The process of diagnosis is both cooperative and iterative. Sun Support relies on you to collect data as well as test various things during the course of a problem investigation. The entire process starts with a problem description. This is very important because it gives us not only a starting point but also an end point. The more definitive the description the better; tell us what the problem is and why it is a problem.
The problem description should include:
What product(s) is/are experiencing the problem
What the configuration is
What other products are involved
What you expected to happen or what should be happening
What is actually happening
Why this is a problem
Whether it is a new installation/configuration or an existing setup
When the problem first appeared and anything else that happened around that time
The following pages aim to assist you with gathering the data which is most commonly required to start the process of investigating and diagnosing your problem.
If you have suggestions or comments about this document, its contents or format, please share them with us by sending e-mail to the following address: Support_feedback@sun.com
Self-help
Verify write permission in the current working directory and on the core file.
Verify there is enough space for the core file.
Verify that ulimit -a shows "unlimited" for coredump.
Verify that coreadm has per-process core dumps enabled.
The coreadm output is incorrect
Run coreadm(1M) and make the following recommended settings:
    # mkdir -p /var/cores
    # coreadm -g /var/cores/%f.%n.%p.core -e global -e process -e global-setid -e proc-setid -e log
    # coreadm
    global core file pattern: /var/cores/%f.%n.%p.core
    init core file pattern: core
    global core dumps: enabled
    per-process core dumps: enabled
    global setid core dumps: enabled
    per-process setid core dumps: enabled
    global core dump logging: enabled
Java specific 1: The process received SIGSEGV or SIGILL but no core dump was produced. The process may have handled the signal itself. For example, the HotSpot VM uses the SIGSEGV signal for legitimate purposes such as throwing NullPointerException, deoptimization etc. Not all SIGSEGVs are bad. The signal is left unhandled by the JVM only if the current instruction (PC) falls outside JVM-generated code, and only in such cases does the HotSpot JVM dump core.
Java specific 2: The JNI Invocation API was used to create the VM, so the standard Java launcher was not used. The custom Java launcher program handled the signal by just consuming it and produced the log entry silently. This has been seen with certain application servers and web servers. These JVM-embedding programs transparently attempt to restart (fail over) the system after abnormal termination. Not producing a core is a feature and not a bug in this case.
The above two points are taken from section 3.2.3 in http://java.sun.com/j2se/1.5/pdf/jdk50_ts_guide.pdf
The coreadm output is correct but no core file is generated when the application server or web server running in SSL mode crashes
For application server: put the statement "SSL_DUMP=1; export SSL_DUMP" in the startserv script of the DAS or instance.
For web server: see the following documentation:
http://sunsolve.sun.com/search/printfriendly.do?assetkey=1-9-63420-1
http://sunsolve.sun.com/search/printfriendly.do?assetkey=1-25-72079-1
Java thread dumps are not generated when the Ctrl and \ keys are pressed, or when the "kill -3" or "kill -QUIT" command is issued
Ensure the -Xrs option is not enabled in the Java flags/switches.
For application server 9.x, use the following command: asadmin generate-jvm-report
On Windows the equivalent key sequence is the Ctrl and Break keys. If you can't get to the console, please install one of the following tools:
http://www.latenighthacking.com/projects/2003/sendSignal
http://www.adaptj.com/webstart/stacktrace/app/launch.jnlp
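The core file checks listed under Self-help (write permission, free space, core size limit) could be scripted roughly as follows. This is only a sketch: the function name and the returned keys are made up for illustration, and it reports the limits of the Python process itself, so it is meaningful only when run from the same user and shell environment that starts the server.

```python
import os
import resource
import shutil


def check_core_dump_prereqs(directory="."):
    """Report whether a core file could plausibly be written to `directory`."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_CORE)
    usage = shutil.disk_usage(directory)
    return {
        # The dumping process needs write permission in its working directory.
        "directory_writable": os.access(directory, os.W_OK),
        # Compare this against the expected core size (roughly the process size).
        "free_bytes": usage.free,
        # RLIM_INFINITY is what `ulimit -a` displays as "unlimited".
        "core_limit_unlimited": soft_limit == resource.RLIM_INFINITY,
        "core_limit_soft": soft_limit,
    }


if __name__ == "__main__":
    for name, value in check_core_dump_prereqs().items():
        print(f"{name}: {value}")
```

The coreadm settings are not covered by this sketch and still need to be checked with coreadm(1M) on Solaris.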
List SUN Software Product involved ( eg Application Server etc ):
List NON-SUN Third Party Software Product involved ( eg Oracle RDBMS ):
Hardware Platform:
OS and Kernel Version:
    Solaris : uname -a
    HP-UX : uname -r
    Linux : uname -a and more /etc/*-release
    Windows : C:\Program Files\Common files\Microsoft Shared\MSInfo\msinfo32.exe /report C:\report.txt
Patch Level:
    Solaris : showrev -p
    HP-UX : swlist
    Linux : rpm -qa
    Windows : Already provided in the C:\report.txt file above.
Description of issue:
Frequency of issue:
When was the problem first noticed:
Recently Changed Variables:
Copy and paste error messages from server log outputs, if any:
Steps to Reproduce Issue:
If it is a crash or hang incident, see the section "Recommended Data Collection".
Scripts for collection of core file
    Solaris PkgApp http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sun_gdd__directory_ __
    Linux PkgApp http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sun_gdd__all__pkg_ a
Date created : Sep 2008 version 1
<property name="docroot" value="${com.sun.aas.instanceRoot}/docroot"/>
<property name="accesslog" value="${com.sun.aas.instanceRoot}/logs/access"/>
</virtual-server>
*************************
<virtual-server hosts="onlinehelp-dev " http-listeners="http-listener-1,http-listener-2" id="onlinehelp-dev" log-file="${com.sun.aas.instanceRoot}/logs/server.log" state="on">
<property name="sso-enabled" value="false"/>
<property name="docroot" value="${com.sun.aas.instanceRoot}/docroot"/>
<property name="accesslog" value="${com.sun.aas.instanceRoot}/logs/access"/>
</virtual-server>
**************************
Basically, both of these log files have the logging level set to FINE for the PERSISTENCE and UTILS. The two test cases executed include the following steps:
gls_server.log_2008-06-17_dev - Log file for a newly created domain in the same SJSAS instance.
gls_server.log_2008-06-17_domain1 - Log file for the domain in the instance that is causing the problem.
However, this works fine if I use asadmin CLI:
Initial deploy (contextroot provided):
asadmin deploy --host applyonline-dev --port 4848 --user admin --passwordfile pw.txt --virtualservers applyonline-dev --contextroot / ProgEnrol-war-2.0.3-SNAPSHOT.war
Redeploy:
asadmin deploy --host applyonline-dev --port 4848 --user admin --passwordfile pw.txt --virtualservers applyonline-dev --force=true ProgEnrol-war-2.0.3-SNAPSHOT.war
=========
As of now, I believe we have two work-arounds available for 9.1 U1:
1. Use the CLI with an explicit virtual server argument.
2. Use the GUI to first do an explicit Undeploy and then a new Deploy. As part of the Deploy operation ensure that a single virtual server is explicitly specified.
I hope that this information helps to progress the case further.
1. Install Sun Java System Application Server PE 8.1 2005Q2 UR2 on Windows 2003 Server.
2. The Application Server needs to be maintained by different team members (at least 3).
3. User-1 logs off from the Windows machine after a shift change to allow User-2 to log in.
4. User-2 logs into the system.
5. Notice that the Application Server has gone down by itself.
6. User-2 needs to start the Application Server manually again.
Install WebServer: 6.1 SP4
Install Application Server: SJSAS: 8.1-2005Q1
Configure load-balancing
Experience the problem
Recommended Data Collection for Crashes and Hangs in the Application Server ( Solaris )
Complete Application server version
<appserver_install-directory>/bin/asadmin version --verbose
Application Server Process Hangs/High CPU
prstat -L >> output.txt
pstack [pid] > pstack.out
Collect the same output for pmap, pldd, pfiles and pflags.
Time stamp of the issue.
server.log files that capture the problem occurrences.
Detail the load/number of concurrent users at the time of the problem.
Does restarting the server resolve the issue?
After a fresh restart of the server, for how long does it run without problems?
For 7.x and 8.x: Issue the kill -3 [pid] command 3 times in succession, at intervals of one minute. This writes the Java thread dump to the server.log file. Please issue the kill -3 [pid] command ONLY while the issue is occurring.
For 9.x: Please use the command asadmin generate-jvm-report. Issue the command a few times over a period of time.
Please provide us the server.log containing the thread dump.
<appserver_install-directory>/domains/<domain_name>/logs/server.log
<appserver_install-directory>/nodeagents/<nodeagent_name>/<instance_name>/logs/server.log
Run gcore on the pid. It will dump a core file of the process.
Run the pkgapp script http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sun_gdd__directory___ on this core file from the same system where the core was generated, and provide the generated files as listed below.
    ./pkgapp.sh corefile
    casenumber_libraries.tar.gz
    casenumber_corefile.tar.gz
Please upload the above files to https://supportuploads.sun.com/upload under the cores directory and provide us the checksum details of the uploaded files.
To track the memory usage, provide the GC logs using the following Java settings (JDK 1.4.2.x and above):
    -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:gclog.txt
Alternatively, you can run the following data collection script from http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sun_gdd_appserv_han when the application server goes into an unresponsive situation. Customers are advised to familiarise themselves with the script in a staging environment first before attempting in a production environment. For more information on the Sun Gathering Debug Data ( GDD ), please refer to the following website http://www.sun.com/service/gdd/index.xml.
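For 7.x and 8.x, the thread dump step described in this section (three kill -3 signals, one minute apart) might be automated along these lines. This is a sketch under assumptions: the function name and its parameters are hypothetical, and the pid must belong to the hung JVM, because SIGQUIT terminates a process that does not handle it.

```python
import os
import signal
import time


def request_thread_dumps(pid, count=3, interval_seconds=60,
                         send=os.kill, sleep=time.sleep):
    """Send SIGQUIT (the signal behind `kill -3`) to `pid` `count` times.

    `send` and `sleep` are injectable so the timing logic can be exercised
    without signalling a real process.
    """
    for attempt in range(count):
        send(pid, signal.SIGQUIT)
        # Pause between dumps, but not after the last one.
        if attempt < count - 1:
            sleep(interval_seconds)
    return count


if __name__ == "__main__":
    import sys
    # Usage: python thread_dumps.py <pid of the hung JVM>
    request_thread_dumps(int(sys.argv[1]))
```

The dumps themselves still end up in server.log, so that file is what needs to be sent to Support afterwards.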
Application Server Process Crash
Check for a core file. If no core file was generated, check the coreadm section mentioned above.
Check for any hs_err_pid.log. By default, the file is created in the working directory of the process.
When was the issue first noticed? (Please provide the exact time stamp of the issue.)
Does this issue happen during high load or any particular activity? Explain.
How often does it occur?
Provide the server.log at the time of the crash and the access log around the time of the issue.
<appserver_install-directory>/domains/<domain_name>/logs/server.log
<appserver_install-directory>/nodeagents/<nodeagent_name>/<instance_name>/logs/server.log
Run the pkgapp script http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sun_gdd__directory___ on this core file from the same system where the core was generated, and provide the generated files as listed below.
    ./pkgapp.sh corefile
    casenumber_libraries.tar.gz
    casenumber_corefile.tar.gz
Please upload the above files to https://supportuploads.sun.com/upload under the cores directory and provide us the checksum details of the uploaded files.
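The checksum details requested above could be produced with a short script such as the following. The document does not say which checksum algorithm Support expects, so the SHA-256 default here is an assumption; pass "md5" or another name accepted by hashlib if Support asks for something else. The function name is illustrative only.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading in chunks so that large
    core archives do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys
    # Usage: python checksum.py casenumber_corefile.tar.gz casenumber_libraries.tar.gz
    for name in sys.argv[1:]:
        print(f"{file_checksum(name)}  {name}")
```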
Recommended Data Collection for Crashes and Hangs in the Application Server ( Windows )
Enable Control-L ( View -> Lower Pane Panel )
Enable View -> Select Columns
Process Memory: Select All Except Private Bytes History
Handle: Select All, then File -> Save as Handle.txt
DLL: Select All, then File -> Save as Dll.txt
High CPU Application Server Process
Launch Process Explorer and locate the application process that indicates high CPU ( java.exe and appserv.exe ).
Copy the content of the stack and paste it into a file.
Collect the Java Thread Stack dump using the instructions below.
steps.pdf
Application Server Process Hangs
Collection of Java Thread Stack Dump ( a few sets of data over a period of time )
7.x and 8.x: Run the following third-party tool from adaptj: http://www.adaptj.com/main/download
9.x: Run the command asadmin generate-jvm-report
Application Server Process Crash
Check for and send any Windows dump file ( produced as a result of the application server crash ).
Check for any hs_err_pid.log. By default, the file is created in the working directory of the process.
When was the issue first noticed? (Please provide the exact time stamp of the issue.)
Does this issue happen during high load or any particular activity? Explain.
How often does it occur?
Provide the server.log at the time of the crash and the access log around the time of the issue.
<appserver_install-directory>/domains/<domain_name>/logs/server.log
<appserver_install-directory>/nodeagents/<nodeagent_name>/<instance_name>/logs/server.log
Recommended Data Collection for Crashes and Hangs in Application Server ( Linux )
Complete Application server version
<appserver_install-directory>/bin/asadmin version --verbose
Application Server Process Hangs/High CPU
top -d -c -b > top.log
Time stamp of the issue.
server.log files that capture the problem occurrences.
Detail the load/number of concurrent users at the time of the problem.
Does restarting the server resolve the issue?
After a fresh restart of the server, for how long does it run without problems?
For 7.x and 8.x: Issue the kill -3 [pid] command 3 times in succession, at intervals of one minute. This writes the Java thread dump to the server.log file. Please issue the kill -3 [pid] command ONLY while the issue is occurring.
For 9.x: Please use the command asadmin generate-jvm-report. Issue the command a few times over a period of time.
Please provide us the server.log containing the thread dump.
<appserver_install-directory>/domains/<domain_name>/logs/server.log
<appserver_install-directory>/nodeagents/<nodeagent_name>/<instance_name>/logs/server.log
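Before sending server.log, it can be worth confirming that the thread dumps actually landed in it. The sketch below counts dump headers; it relies on the fact that HotSpot starts each dump with a line containing "Full thread dump". The function name and the command-line handling are illustrative assumptions.

```python
def count_thread_dumps(log_text, marker="Full thread dump"):
    """Count the HotSpot thread dump headers found in server.log text."""
    return sum(1 for line in log_text.splitlines() if marker in line)


if __name__ == "__main__":
    import sys
    # Usage: python count_dumps.py <path to server.log>
    with open(sys.argv[1], errors="replace") as handle:
        print(count_thread_dumps(handle.read()))
```

A count lower than the number of kill -3 signals issued suggests that the signal did not reach the JVM or that -Xrs is set.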
Application Server Process Crash
Check for and send any core file ( produced as a result of the application server crash ).
Check for any hs_err_pid.log. By default, the file is created in the working directory of the process.
When was the issue first noticed? (Please provide the exact time stamp of the issue.)
Does this issue happen during high load or any particular activity? Explain.
How often does it occur?
Provide the server.log at the time of the crash and the access log around the time of the issue.
<appserver_install-directory>/domains/<domain_name>/logs/server.log
<appserver_install-directory>/nodeagents/<nodeagent_name>/<instance_name>/logs/server.log
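Locating the HotSpot fatal error log mentioned in the crash checklist can be sketched as follows. HotSpot names these files hs_err_pid<pid>.log, so the sketch matches on that pattern; the function name is hypothetical, and the directory to search is whatever working directory the server process was started from.

```python
import glob
import os


def find_hs_err_logs(working_dir="."):
    """Return the hs_err_pid<pid>.log files in `working_dir`, newest first."""
    pattern = os.path.join(glob.escape(working_dir), "hs_err_pid*.log")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    import sys
    # Usage: python find_hs_err.py <working directory of the server process>
    for path in find_hs_err_logs(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```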
Recommended Data Collection for HADB issue
hadbm -V
hadbm get --all [hadb_name]
hadbm status [hadb_name]
hadbm status --nodes [hadb_name]
hadbm deviceinfo --details [hadb_name]
hadbm resourceinfo --databuf [hadb_name]
HADB history files from directory /etc/system
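The hadbm commands above could be run in one pass and their output saved for the support case, roughly as sketched here. The function names are invented for illustration, the command spelling follows the first command in the list (hadbm), and the sketch assumes hadbm is on the PATH and runs without prompting for input.

```python
import subprocess


def hadbm_commands(database_name):
    """Build the hadbm invocations from the list above for one database."""
    return [
        ["hadbm", "-V"],
        ["hadbm", "get", "--all", database_name],
        ["hadbm", "status", database_name],
        ["hadbm", "status", "--nodes", database_name],
        ["hadbm", "deviceinfo", "--details", database_name],
        ["hadbm", "resourceinfo", "--databuf", database_name],
    ]


def run_and_capture(argv):
    """Run one command and return its combined stdout and stderr."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def collect_hadb_data(database_name, run=run_and_capture):
    """Map each command line to its output; `run` is injectable for testing."""
    return {" ".join(argv): run(argv) for argv in hadbm_commands(database_name)}


if __name__ == "__main__":
    import sys
    # Usage: python hadb_collect.py <hadb_name>
    for command, output in collect_hadb_data(sys.argv[1]).items():
        print(f"### {command}\n{output}")
```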
Recommended Data Collection for Crashes and Hangs in Web Server
WS6.1
<webserver_install-directory>/bin/https/bin/webservd -v
    # ./webservd -v
    Sun Microsystems, Inc.
    Sun ONE Web Server 6.1SP8 B06/13/2007 23:15
WS7.0
<webserver_install-directory>/lib/webservd -v
    # ./webservd -v
    Sun Microsystems, Inc.
    Sun Java System Web Server 7.0U2 B12/09/2007 09:02
Note: on the Windows platform check the error logs after Web Server startup to find the specific version (also applicable on UNIX platforms).
Follow the advice in the Sun Gathering Debug Data for Sun Java System Web Server documentation ( WS 6.0, 6.1 and 7.0 ):
Web Server basic information: To Gather General Debug Data for Any Web Server Problem
http://docs.sun.com/app/docs/doc/820-2483/geaaj?a=view
Web Server fails to install: To Gather Debug Data on Web Server Installation Problems
http://docs.sun.com/app/docs/doc/820-2483/geaav?a=view
Web Server fails to start up: To Gather Debug Data on Web Server Startup Problem
http://docs.sun.com/app/docs/doc/820-2483/geabd?a=view
Web Server hangs or is unresponsive: To Gather Debug Data on a Hung or Unresponsive Web Server Process
http://docs.sun.com/app/docs/doc/820-2483/gebbt?a=view
Web Server crashes: To Gather Debug Data on Web Server Crashed Process
http://docs.sun.com/app/docs/doc/820-2483/geaai?a=view
Java Self-help
To obtain the java version, type this command:
    # java -version
    java version "1.5.0_09"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_09-b03)
    Java HotSpot(TM) Server VM (build 1.5.0_09-b03, mixed mode)
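When the version has to be recorded by a script rather than read off the screen, the banner shown above can be parsed as in this sketch. The function names are illustrative assumptions. Note that java -version writes its banner to stderr rather than stdout, which is why both streams are captured.

```python
import re
import subprocess

# Matches the quoted version in a line such as: java version "1.5.0_09"
_VERSION_RE = re.compile(r'version "([^"]+)"')


def parse_java_version(banner):
    """Extract the version string from `java -version` output, or None."""
    match = _VERSION_RE.search(banner)
    return match.group(1) if match else None


def installed_java_version(java="java"):
    """Run `java -version` and return the parsed version string."""
    result = subprocess.run([java, "-version"],
                            capture_output=True, text=True, check=False)
    return parse_java_version(result.stderr + result.stdout)


if __name__ == "__main__":
    print(installed_java_version())
```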
Investigating VM crashes
Is it in the native code outside the VM?
Is it inside the VM code?
Is it crashing in compiled code?
Reproduce it with the -Xcomp option.
Reproduce it with the -Xint option? If so, most likely it is not a compiler issue.
Reproduce with -client or -server if it is a compiler issue.
-d32 or -d64 helps you identify whether the issue is specific to the 32-bit or 64-bit architecture.
Running with the -XX:+ShowMessageBoxOnError flag will generally suspend the VM when the error is encountered.
Set the values of -Xms and -Xmx to be the same.
Debugging Application Hangs
Frequency of Full GC
Libthread issues
Lack of LWPs
Debugging Application crashes
Try debugging flags to increase information output: -verbose:gc, -verbose:jni, -XX:+PrintCompilation
Use an alternative VM option to narrow down the problem area, e.g. client/server compiler, standard/incremental GC, 64-bit/32-bit VM.
Signals related issues
An application with native code crashes when the native code installs its own signal handlers: use signal chaining, -XX:+UseSignalChaining
Applications that embed the VM frequently need to trap signals like SIGINT or SIGTERM: reduce signal usage with -Xrs. Caveat: shutdown hooks are not run with this option and SIGQUIT thread dumps are not available.
Debugging Memory issues
Run the application and monitor memory usage with -verbose:gc
Always use the full crash dump type for Dr. Watson. See the following sample screenshot:
For details about Dr. Watson, please refer to Description of the Dr. Watson for Windows (Drwtsn32.exe) Tool - http://support.microsoft.com/kb/308538
Alternatively, you can use ADplus. Please refer to How to use ADPlus to troubleshoot "hangs" and "crashes" - http://support.microsoft.com/default.aspx?scid=kb;en-us;286350
In particular, if you run the JVM as a "service", which is similar to a daemon, then ADplus is the only way. Choose these options:
    adplus -crash -o <outputdir> -p <pid> -quiet -NoDumpOnFirst
Note: The option -NoDumpOnFirst is documented (run adplus -?). If the option is omitted, ADplus will create a minidump every minute, which will quickly fill up disk space.
To collect user-mode dump with Windows Server 2008 and Windows Vista Service Pack 1 (SP1) - http://msdn.microsoft.com/en-us/library/bb787181(VS.85).aspx