
Problem

DOCUMENTATION: How to troubleshoot NDMP backup failures when Status Code 99 (NDMP backup failure) is reported.

Solution
Manual: VERITAS NetBackup (tm) 6.5 Troubleshooting Guide for UNIX, Linux and Windows
Modification Type: Supplement
Modification: During regular (standard) NDMP backups, avoid potential NDMP communications failures between NetBackup master or media servers and the NetApp NDMP filer host.

Troubleshooting: The following procedures may help isolate the root cause of any NDMP backup issues:
1. Test the connection to the NDMP port (10000) with the telnet command. Try both the hostname and its IP address. For example, telnet ndmp_host 10000.
2. Run tpautoconf -verify ndmp_filer and tpautoconf -probe ndmp_host. If these commands fail, attempt them from another master or media server to check another route or network path.
3. Go to the filer appliance and run ndmpd status to verify that the NDMP daemon is running. If it is not, execute ndmpd on and verify again with ndmpd status.
4. Attempt the telnet and tpautoconf command tests again from the media or master server.
5. Proceed with any new configurations or begin backups or restores.

Log Files: n/a. Watch the Job Details for the job in the Activity Monitor during backup attempts.
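Where telnet is not installed on the master or media server, the port test in step 1 can be approximated with a short script. This is a minimal sketch, not part of the NetBackup tooling: the function name ndmp_port_reachable is hypothetical, and the only facts taken from the procedure above are the NDMP port number (10000) and the advice to test both the hostname and the IP address.

```python
import socket


def ndmp_port_reachable(host, port=10000, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only proves that something accepts TCP
    connections on the port, which is what the telnet test also shows.
    It does not speak the NDMP protocol.
    """
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling ndmp_port_reachable("ndmp_host") and then the same function with the filer's IP address mirrors the two telnet attempts. If the IP address succeeds and the hostname fails, name resolution is the likely suspect.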

Problem
BUG REPORT: After applying NetBackup 6.0 MP5, NDMP backups begin failing with NetBackup Status Code 27 (child process killed by signal) or Status Code 99 (NDMP backup failure)

Error
child process killed by signal

Solution
Bug: 1111632: 6.0MP5 patch appears to break NDMP backups, fail with status 27

Symptom(s): After applying the NetBackup 6.0 Maintenance Pack 5 (MP5) patch, all Network Data Management Protocol (NDMP) backups began failing with a NetBackup Status Code 27 (child process killed by signal) or Status Code 99 (NDMP backup failure). Other backups are working fine.

Log Files: The bptm log file shows the following:
14:01:01.392 [5968] <2> io_write_block: ndmp_tape_write_func returned 1024
14:01:01.392 [5968] <2> write_data: completed writing backup header, start writing data when first buffer is available, copy 1
14:01:01.392 [5968] <2> NdmpSession: [1] Sending NDMP_MOVER_SET_RECORD_SIZE 36
14:01:01.392 [5968] <2> NdmpSession: [1] Reply error = 0
14:01:01.392 [5968] <2> NdmpSession: [2] Sending 19 (GET_PATH) ""
14:01:01.393 [5968] <2> NdmpSession: [2] Reply 19 "/vol/data_wh/oracle/exports",

The number following NDMP_MOVER_SET_RECORD_SIZE corresponds to the number placed in the NUMBER_DATA_BUFFERS file (found in the /usr/openv/netbackup/db/config/ directory on a UNIX/Linux media server or in the <install_path>\veritas\netbackup\db\config directory on a Windows media server). If the number following NDMP_MOVER_SET_RECORD_SIZE is something other than 64512, the site is affected.

Workaround: Depending on the environment, there are two workarounds available:
1. Remove the NUMBER_DATA_BUFFERS touch file from the media server running backups for the NDMP devices.
2. Remove NetBackup 6.0 MP5 and reapply NetBackup 6.0 MP4.
Please be aware that option 1 can potentially introduce other problems and negatively impact NetBackup performance. Please exercise caution while testing this workaround. If the NUMBER_DATA_BUFFERS touch file is required for the environment, please contact Symantec Technical Support for additional assistance.

ETA of Fix: Symantec Corporation has acknowledged that the above-mentioned issue is present in the current version(s) of the product(s) mentioned at the end of this article. Symantec Corporation is committed to product quality and satisfied customers. This issue is currently being considered by Symantec Corporation to be addressed in a forthcoming Maintenance Pack or version of the product. The fix for this issue is expected to be released in the fourth quarter of 2007. Please note that Symantec Corporation reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests or introduces new risks to overall code stability. Symantec's plans are subject to change and any action taken by you based on the above information or your reliance upon the above information is made at your own risk. Please refer to the maintenance pack readme or contact NetBackup Enterprise Support to confirm this issue (ET1111632) was included in the maintenance pack.
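The "is this site affected" check from the Log Files discussion above (look for the number after NDMP_MOVER_SET_RECORD_SIZE in the bptm log and compare it with 64512) can be automated. The following is a rough sketch under stated assumptions: the function names are hypothetical, and the pattern is based only on the single log excerpt quoted in this article, so other bptm log layouts may need a different expression.

```python
import re

# Value the article treats as unaffected.
EXPECTED_RECORD_SIZE = 64512

# Matches e.g. "... Sending NDMP_MOVER_SET_RECORD_SIZE 36" (format assumed
# from the excerpt in this article).
_RECORD_SIZE_RE = re.compile(r"NDMP_MOVER_SET_RECORD_SIZE\s+(\d+)")


def record_sizes(log_lines):
    """Return every integer that follows NDMP_MOVER_SET_RECORD_SIZE."""
    sizes = []
    for line in log_lines:
        for match in _RECORD_SIZE_RE.finditer(line):
            sizes.append(int(match.group(1)))
    return sizes


def site_is_affected(log_lines):
    """Return True if any logged record size differs from 64512."""
    return any(size != EXPECTED_RECORD_SIZE for size in record_sizes(log_lines))
```

Passing an open bptm log file object (or a list of its lines) to site_is_affected gives a quick yes/no answer. For the excerpt quoted above, the logged value is 36, so the function reports the site as affected.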

Problem
NDMP backup fails with Status Code 99 - DUMP: could not create "backup" snapshot : No space left on device.

Solution
========================
ISSUE:
========================
NDMP Backup fails with Status Code 99

========================
EVIDENCE/LOGS:
========================
NDMP Daemon Debug Log from the NetApp Filer:
Error code: NDMP_NO_TAPE_LOADED_ERR
Device name: nrst1a
Mode: 0
IOException: Device cannot be opened. Device may have no tape.
NDMP message type: NDMP_CONNECT_CLOSE

========================
TROUBLESHOOTING:
========================
1. Enable NDMP Debug Logging on the NetApp Filer by running the following commands on the Filer:
ndmpd debug screen
ndmpd debug 70
2. Recreate the issue by running a manual backup job.
3. The following entries may be observed in the ndmpdlog.<date> log file:
Error code: NDMP_NO_TAPE_LOADED_ERR
Device name: nrst1a
Mode: 0
IOException: Device cannot be opened. Device may have no tape.
NDMP message type: NDMP_CONNECT_CLOSE
4. To rule out NetBackup as the source of the issue, run a data dump directly to tape from the NetApp Filer.
5. Use the robtest utility from the Robot Control Host (in this scenario the Robot Control Host is the Master Server) to manually mount the tape on nrst1a.
-> robtest
-> select the option for the robot in question (in this case, TLD(0))
-> m s# d#
-> q
-> select option 2 to exit out of the robtest utility
Note: m s# d# moves the tape from slot # to drive #.
6. After manually mounting the media on the desired drive, run the following command from the NetApp Filer to initiate a data dump directly to tape:
dump 0ufb nrst1a 63 /vol/vol1
7. The following error message can be observed on the NetApp Filer:
DUMP: creating "/vol/vol1/../snapshot_for_backup.1231" snapshot.
DUMP: could not create "backup" snapshot : No space left on device.
DUMP: Dump failed to initialize.
DUMP: DUMP IS ABORTED
8. Review the available disk space on volume /vol/vol1.
Note: There should be at least 20% of available disk space for volumes that are enabled for snapshots. Please contact NetApp for minimum requirements.

========================
SOLUTION/WORKAROUND:
========================
Ensure that there is at least 20% of available disk space on the volume enabled for snapshots.
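The 20% guideline above is simple arithmetic, and it can be scripted as a pre-backup check once the volume's total and available space are known. The sketch below is illustrative only: the function names are hypothetical, and it assumes the two figures have already been read from the filer's space report for the volume in matching units (kilobytes here).

```python
# Threshold quoted in this article; NetApp may publish different minimums.
MIN_FREE_PERCENT = 20.0


def free_percent(total_kb, available_kb):
    """Return available space as a percentage of total volume size."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return 100.0 * available_kb / total_kb


def has_snapshot_headroom(total_kb, available_kb, minimum=MIN_FREE_PERCENT):
    """Return True if the volume meets the minimum free-space guideline."""
    return free_percent(total_kb, available_kb) >= minimum
```

For example, a volume of 1,000,000 KB with 150,000 KB available has 15% free, so has_snapshot_headroom(1000000, 150000) returns False and the backup snapshot may fail as shown in step 7.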
