January 8, 2010
TSB 2010-071-A
PRODUCTS AFFECTED:
BULLETIN OVERVIEW
The system monitoring service (SYSMON) feature can be used to help troubleshoot packet loss
issues. SYSMON monitors the hardware in the system to detect, report, and in some cases isolate
and recover from hardware errors. When an error or event occurs, SYSMON generates two types of
Syslog messages: INFO and ALARM. Report all SYSMON ALARM messages to Brocade Technical
Support.
Brocade produces and publishes Technical Support Bulletins to OEMs, partners and customers that
have a direct, entitled, support relationship in place with Brocade.
Please contact your primary service provider for further information regarding this topic and
applicability for your environment.
PROBLEM STATEMENT
The SYSMON feature was created to help troubleshoot packet loss, packet forwarding, or protocol
flap issues on the BigIron RX. SYSMON helps to resolve these issues when they are hardware
related.
In the 02.7.02b patch release, the following SYSMON enhancements were added:
1. A read and write test was added to detect Switch Fabric Module (SFM) fabric element (FE)
failures. A Syslog message is generated to indicate an SFM FE failure.
2. The performance of ingress DRAM CRC detection was improved. This detection now uses
two methods: an interrupt routine detects errors quickly and triggers a shutdown of the
failed Traffic Manager (TM), and long-term polling detects low-rate CRC errors, which are
reported with a Syslog message.
3. Blocked (stuck) TM egress buffers can now be detected and recovered. This process
generates a Syslog message.
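The bulletin does not describe how the long-term polling in item 2 is implemented. As a rough illustration of the general idea only, the sketch below polls a cumulative error counter over a long window and reports any window in which the counter grew. The function name, the one-hour window, and the one-error threshold are all hypothetical and are not taken from the BigIron RX software.

```python
from typing import Iterable, List, Tuple


def low_rate_crc_windows(
    samples: Iterable[Tuple[float, int]],
    window_seconds: float = 3600.0,  # hypothetical polling window
    min_errors: int = 1,             # hypothetical reporting threshold
) -> List[Tuple[float, int]]:
    """Scan (time_seconds, cumulative_crc_count) samples and return
    (window_end_time, errors_in_window) for every polling window in which
    the cumulative counter grew by at least min_errors."""
    reports: List[Tuple[float, int]] = []
    window_start = None
    start_count = 0
    for t, count in samples:
        if window_start is None:
            # The first sample only establishes the baseline.
            window_start, start_count = t, count
            continue
        if t - window_start >= window_seconds:
            delta = count - start_count
            if delta >= min_errors:
                reports.append((t, delta))
            # Start a new window from the current sample.
            window_start, start_count = t, count
    return reports
```

A counter that gains only a few errors per hour never produces a burst, but it still shows up as a nonzero delta at the end of a long window, which is the kind of low-rate condition the enhancement reports through Syslog.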
RISK ASSESSMENT
As a result of the extended monitoring enhancements, any marginal hardware may be reported
as defective. For any ALARM log messages, please call Support for further instructions.
Brocade Technical Support may guide you in using SYSMON commands to obtain additional
information. Do not use these commands without guidance from technical support.
SYMPTOMS
The table below provides examples of Syslog messages generated by SYSMON and the events that
trigger them.
Syslog message examples and event descriptions

Event description: Traffic Manager (TM) link
Syslog message example:
Dec 29 15:31:24:W:System:ALARM:LP15/TM3 has 6 links, less than the minimum to maintain line rate

Event description: TM clock synchronization
Syslog message example:
Dec 29 16:25:24:E:System:ALARM:TM Clock Drift: LP15/TM4 (Reg: 0xbe20,Value: 0x2b013ffc)

Event description: TM registers
Syslog message example:
Dec 29 17:19:26:E:System:ALARM:LP15/TM1 has shutdown(TM Internal Error: LP15/TM1(Reg: 0x444, Value: 0x70000)(shutdown))

Syslog message examples:
Jan 8 11:48:29:E:System:ALARM:LP3/TM1 has shutdown(TM Q Stuck: LP3/TM1 (Queue1024, Size 50331648, Credit2047, RdPtr 0x0)(shutdown))
Jan 8 11:48:19:E:System:ALARM:LP3/TM1 re-initialized(TM Q Stuck: LP3/TM1 (Queue1024, Size 50331648, Credit2047, RdPtr 0x0) (QDPreinit))
Jan 8 11:48:19:E:System:ALARM:LP3/TM1 QDP re-initialized (TM Q Stuck:LP3/TM1 (Queue 1024, Size50331648, Credit 2047, RdPtr0x0) (QDP reinit))

Event description: TCAM scan
Syslog message example:
Dec 31 13:51:42:E:System:ALARM:LP1/TM1 has shutdown(LP1/NP1 TCAM failure(shutdown))

Syslog message example:
Jan 8 08:59:29:E:System:ALARM:LP3/NP2 packet path diagnostic failure

Event description: Switch fabric element read/write error
Syslog message example:
Sep 13 15:01:29:E:System:ALARM:FE Read-Write Test Error: SNM4/FE1 Reg 0x14,Read 0x48000000 != Written 0x0

Event description: TM ingress DRAM CRC error
Syslog message example:
Sep 13 15:01:29:E:System:ALARM: LP9/TM2 has shutdown(TM DRAM CRC: LP9/TM2 (Reg:0xa50c, Value: 0x7)(shutdown))

Event description: TM egress buffer stuck error
Syslog message example:
Sep 13 15:01:29:E:System:ALARM: LP16/TM1 re-initialized (TM TX BUFFERStuck: LP16/TM1 (Reg:0x400000b4, Value: 0x1de)(reinit))
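Because all ALARM messages should be reported to Support, it can help to pull them out of a collected Syslog file. The sketch below is one possible way to do that. The pattern is inferred only from the example messages in this bulletin (timestamp, one-letter severity, System, then ALARM or INFO); the INFO layout is assumed to match, and the function and class names are hypothetical. It is not a Brocade tool and may need adjusting for other message layouts.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Assumed layout, inferred from the examples in this bulletin:
#   <Mon> <day> <HH:MM:SS>:<severity letter>:System:<ALARM|INFO>:<text>
SYSMON_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?P<severity>[A-Z]):System:"
    r"(?P<kind>ALARM|INFO):\s*(?P<text>.*)$"
)

# Module/device identifiers as they appear in the examples,
# for instance LP15/TM3, LP3/NP2, or SNM4/FE1.
DEVICE_RE = re.compile(r"\b(?:LP|SNM)\d+/(?:TM|NP|FE)\d+\b")


class SysmonMessage(NamedTuple):
    timestamp: str
    severity: str
    kind: str
    device: Optional[str]
    text: str


def parse_sysmon(line: str) -> Optional[SysmonMessage]:
    """Parse one Syslog line; return None if it does not match the assumed layout."""
    m = SYSMON_RE.match(line.strip())
    if m is None:
        return None
    dev = DEVICE_RE.search(m["text"])
    return SysmonMessage(
        timestamp=f'{m["month"]} {m["day"]} {m["time"]}',
        severity=m["severity"],
        kind=m["kind"],
        device=dev.group(0) if dev else None,
        text=m["text"],
    )


def alarms(lines: Iterable[str]) -> List[SysmonMessage]:
    """Return only the parsed ALARM messages from an iterable of log lines."""
    parsed = (parse_sysmon(line) for line in lines)
    return [p for p in parsed if p is not None and p.kind == "ALARM"]
```

For example, passing the lines of a saved log file to alarms() yields the messages to include when opening a case, with the first module identifier in each message available in the device field.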
WORKAROUND
The default behavior for the TM register monitor action has been changed. The shutdown action is
now enabled by default. To disable this action, enter the no sysmon tm reg shutdown command.
CORRECTIVE ACTION
Brocade recommends that customers use SYSMON as it is configured by default in code 2.7.02b
or higher. This provides the best system monitoring protection. For any ALARM log messages,
please call Support for further instructions.