
Problem

DOCUMENTATION: Supplemental Information to Media Manager System Administrator's Guide for Automated Cartridge System Library Software

Solution
Manual: Media Manager System Administrator's Guide, Page: Appendix
Modification Type: Addition

Modification: Please review all sections before calling VERITAS support to see if this document can help you resolve your issue. If not, please place a support call and a NetBackup (tm) Technical Support Engineer will respond in a timely manner.

Quick overview of acsd and acsssi processes: Unlike the tldcd design for TLD robots, which requires one robotic control host, acsd is a robotic daemon that runs on each media server with Automated Cartridge System (ACS) drives attached. Robotic mount and dismount requests are sent from each media server to the Automated Cartridge System Library Software (ACSLS) server via the acsssi daemon, by means of an API call. Tape drive requests to read, write, position, and rewind are done through the device paths on each media server, not through the ACSLS server. For a complete overview of ACS and NetBackup (tm), please refer to the Automated Cartridge System (ACS) section in the Media Manager System Administrator's Guide.

TABLE OF CONTENTS
1. LOGGING
1.1 LOG LOCATIONS
1.1.1 Event Logs
1.2 ACS PROCESS TRACING
1.2.1 ACSSSI Tracing on the VERITAS NetBackup media server
1.2.2 ACSSS Tracing on the Library Server
2 ACS LIBRARY SERVER (ACSLS) FUNCTIONS AND COMMANDS
2.1 ACCESS CONTROL ON THE ACS LIBRARY SERVER (ACSLS)
2.2 ACSSA COMMANDS ON THE ACS LIBRARY SERVER
2.2.1 Log on to the ACSLS Server
2.2.2 Query the Library Management Unit
2.2.3 Query the Cartridge Access Ports
2.2.4 Query silos (Library Storage Modules)
2.2.5 Query Drives
2.2.6 Query Volumes

2.2.7 Command to Start Request Processing
2.2.8 Vary on LSM
2.2.9 Logoff from ACSSA (ACSLS Server interface)
2.3 ACSLS TAPE CLEANING
3 DEVICE CONFIGURATION FOR ACSLS CONTROLLED TAPE DRIVES
3.1 DEVICE CONFIGURATION
3.1.1 SSO Device Configuration for ACSLS
3.1.2 NON-SSO Device Configuration for ACSLS
3.1.3 Initial configuration of StorageTek T9940A and T9940B tape drives in an ACSLS environment
4 ROBTEST FOR ACS LIBRARIES
4.1 INVOKING ROBTEST
4.2 ROBTEST SYNTAX
4.2.1 Command to Obtain Drive Status
4.2.2 Command to Query Volumes
4.2.3 Command to Mount a Volume
4.2.4 Command to Dismount a Volume
4.3 HOW TO DEFINE ACSLS SCRATCH POOLS AND ADD VOLUMES USING ROBTEST
5 MEDIA
5.1 AVAILABLE MEDIA SCRIPT
5.2 HOW TO SPECIFY WHICH MEDIA ACCESS PORT (MAP) TO USE FOR TAPE EJECTION
6 COMMUNICATION
6.1 REMOTE PROCEDURE CALL
6.1.1 How to Start RPC on Different Operating Systems
6.1.2 How to Verify that RPC is Running
6.1.3 How to Verify the ACSSS Program Registration
6.1.4 Basic snoop Output
7 COMMON ACS ERROR MESSAGES
7.1 ACS (2) UNAVAILABLE: INITIALIZATION FAILED: UNABLE TO INITIALIZE ROBOT
7.2 ACS STATUS = 54, STATUS_IPC_FAILURE
7.3 ACS STATUS = 72, STATUS_PENDING
7.4 STATUS_NI_FAILURE
7.4.1 ACS status = 104, STATUS_NI_FAILURE
7.4.2 ACS status = 105, STATUS_NI_TIMEDOUT

1. LOGGING
This section covers ACSLS log locations and ACS process tracing.

1.1 LOG LOCATIONS
1.1.1 Event Logs
Event log location on NetBackup media server: /usr/openv/volmgr/debug/acsssi/event.log
Event log location on ACS Library Server: /export/home/ACSSS/log/acsss_event.log
(There is also an install, configuration change, and statistics log in the same directory.)

Typical event log entries are CAP operations, remote procedure call (RPC) and client initialization, robot errors, NI failures, and drive status changes.

1.2 ACS PROCESS TRACING
1.2.1 ACSSSI Tracing on the VERITAS NetBackup media server
1.2.1.1 To turn on acsssi tracing on the NetBackup media server, send a SIGUSR1 signal to the acsssi process as follows:
a. Ensure this directory exists:
/usr/openv/volmgr/debug/acsssi

b. Find the acsssi PID by running the command:


/usr/openv/volmgr/bin/vmps

c. Toggle on acsssi tracing:


kill -USR1 <pid of acsssi>

(This will start a trace.log in the /usr/openv/volmgr/debug/acsssi directory.)
d. To turn off acsssi tracing:
kill -USR1 <pid of acsssi>
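Steps a through d can be combined into one helper. The following is a minimal sketch, not part of the VERITAS tooling: the function names are hypothetical, and it assumes the standard `ps -e -o pid= -o comm=` listing in place of vmps, whose output format is not shown in this document.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not NetBackup commands.

# Read "PID COMMAND" lines on stdin and print the PID of every process
# whose command basename is exactly "acsssi".
acsssi_pids() {
    awk '{ n = split($2, parts, "/"); if (parts[n] == "acsssi") print $1 }'
}

# Create the debug directory if needed, then send SIGUSR1 to each acsssi
# process. Because the signal is a toggle, the same call turns tracing
# on the first time and off the second time.
toggle_acsssi_trace() {
    debug_dir=${1:-/usr/openv/volmgr/debug/acsssi}
    mkdir -p "$debug_dir" || return 1
    # Assumption: ps is used here instead of vmps.
    pids=$(ps -e -o pid= -o comm= | acsssi_pids)
    if [ -z "$pids" ]; then
        echo "acsssi is not running" >&2
        return 1
    fi
    for pid in $pids; do
        kill -USR1 "$pid"
    done
}
```

Run toggle_acsssi_trace as root once to start the trace and once more to stop it.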

Tracing can be turned on or off multiple times using the same kill command.

NOTE: To read the trace.log, it is necessary to use the StorageTek trace_decode. Please contact StorageTek for a copy and instructions for use.

1.2.2 ACSSS Tracing on the Library Server
1.2.2.1 To turn on the Library Server's ability to trace the ONC RPC session and capture packets exchanged between the media server (SSI) and Library Server system (CSI):
a. Run toggle with the option "on" located in /export/home/ACSSS/diag/bin:
# ./toggle on

b. To turn on CSI tracing, simply send a SIGUSR1 signal to the CSI process. This can be accomplished by using the kill command as follows:
# kill -USR1 [CSI pid]

NOTE: It is necessary to first get the PID of the CSI using the ps command.

Once enabled, an acsssi_trace.log will be created in the /export/home/ACSSS/log directory. The log file contains a record of all packet activity between the media server and the ACSLS server. Each packet is displayed with a time stamp, the direction of the packet, the SSI client IP address, port, identifier, and a hex dump of the contents of the packet.

To read the trace.log, it is necessary to use the StorageTek trace_decode. Please contact StorageTek for a copy and instructions for use. The decoder from StorageTek will output each packet with the time stamp, the packet direction (to or from the ACSLS), the command type (e.g. start), the packet type (e.g. request, response), the number of bytes in the packet, and the values of each of the fields in the CSI header and message header structures, plus any command-specific parameters. For each field in the structure, the byte offset, size, and value (in hex and ASCII) are also given.

To turn off tracing, use the same method used to turn on tracing. Tracing can be toggled on/off multiple times using the same command.

For Windows: Windows event logging is turned on using the mini_el and the ACSSEL function shipped with the ACS product. The packet trace is controlled using the toggle_trace script. Both tools are in the Program Files\StorageTek\LibAttach\bin directory.

NOTE: Do not leave tracing on indefinitely because it may fill disk space over time.

2. ACS LIBRARY SERVER (ACSLS) FUNCTIONS AND COMMANDS
This section covers ACSLS functions and commands.

2.1 ACCESS CONTROL ON THE ACS LIBRARY SERVER (ACSLS)
ACSLS can control both command access and volume access. It uses a set of client identification files and a series of allow or disallow files to manage access control. These control files reside on the ACSLS library server in the $ACS_HOME/data/external/access_control directory.

The internet.addresses file allows access control of hosts. ACSLS will compare this file against the user_id field in the received RPC request packet to determine whether to forward the packet on for further processing. A non-zero return code for this operation will result in a STATUS_INVALID_OPTION response being sent back to the media server. Volume control can be done through the set owner command in the cmd_proc utility or in the file ownership.assignment. Commands that are rejected due to access control violations return a STATUS_INVALID_OPTION status.

ACSLS access control is the preferred method for segregating volume access within a large ACSLS tape library. The previous method used INVENTORY_FILTER in the vm.conf file of the server. With that method, if the ACSLS server contained 50,000 tape volumes, a robot inventory would retrieve all 50,000 volumes and then compare them one at a time to the inventory filter, rejecting all tapes that were not in the correct scratch pool per the INVENTORY_FILTER directive. This could take a very long time to complete.

Using the more recent access control method, the ACSLS administrator sets ownership on specific tape volumes, or ranges of tape volumes, to an owner_id. This owner_id is associated with the NetBackup server through the internet.addresses file on the ACSLS server. Only tapes that are owned by the requesting server are returned. Consequently, instead of retrieving 50,000 tape volume entries, NetBackup receives only the volumes owned by the requesting server, and the robot inventory runs much faster. Configuring access control is outside the scope of this document; contact Sun Support for details.

If tapes that are reported via 'q vol' in cmd_proc on the ACSLS server are not being reported during a robot inventory, verify that the tapes in question have the correct ownership on the ACSLS server by running this command as the ACSSA user:
$ volrpt -d -f /export/home/ACSSS/data/external/volrpt/owner_id.volrpt

The output will contain data similar to this:


C00026    VOLUME_HOME    master1
C00027    VOLUME_HOME    master2
C00028    VOLUME_HOME    master1

This indicates that a robot inventory from the server known as 'master1' will return two tapes, and a robot inventory from the server known as 'master2' will return one tape. Volume access control applies to the following commands:
dismount, lock, mount_readonly, set_clean, set_scratch, eject, mount, query_volume, set_owner, unlock
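The per-owner counts read off the volrpt output above can be tallied automatically. This is a minimal sketch that assumes the three-column layout shown in the sample (volume ID, location, owner) with no header lines; the function name is hypothetical and real volrpt output may need extra filtering.

```shell
#!/bin/sh
# Hypothetical helper: reads volrpt-style lines on stdin and prints
# "<owner> <number of volumes>" for each owner, sorted by owner name.
# Assumption: the owner is the last whitespace-separated field.
count_volumes_by_owner() {
    awk 'NF >= 3 { count[$NF]++ }
         END { for (owner in count) print owner, count[owner] }' | sort
}
```

For example, piping the volrpt command shown earlier into count_volumes_by_owner gives one line per owner, which can be compared against the number of volumes a robot inventory returns for that server.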

For further information on access control, please contact StorageTek/Sun.

2.2 ACSSA COMMANDS ON THE ACS LIBRARY SERVER
2.2.1 Log on to the ACSLS Server
# su - acsss

At the prompt, enter:


$ cmd_proc -ql

Wait for the ACSSA> prompt.

2.2.2 Query the Library Management Unit
ACSSA> q lmu all
2004-01-28 14:24:10    LMU Status
ACS: 0    Mode: SCSI LMU    Master Status: Communicating
                            Standby Status: -
Port    Port State    Role    CL    Port Name
0, 0    online        -             /dev/mchanger2

2.2.3 Query the Cartridge Access Ports


ACSSA> q cap all
2004-01-28 14:25:30    CAP Status
Identifier    Priority    Size    State     Mode         Status
0, 0,0        0           10      online    automatic    available

2.2.4 Query silos (Library Storage Modules)


ACSSA> q lsm all
2004-01-28 14:26:22    LSM Status
Identifier    State     Free Cell    Audit    Mount    Dismount    Enter    Eject
                        Count        C/P      C/P      C/P         C/P      C/P
0, 0          online    36           0/0      0/0      0/0         0/0      0/0

2.2.5 Query Drives


ACSSA> q drive all
2004-01-28 14:27:34    Drive Status
Identifier     State     Status       Volume    Type
0, 0, 0, 0     online    available              DLT7000
0, 0, 0, 1     online    available              DLT7000
0, 0, 0, 2     online    available              9840
0, 0, 0, 3     online    available              9840

2.2.6 Query Volumes


ACSSA> q volume all
2004-01-28 15:58:36    Volume Status
Identifier    Status    Current Location    Type
000002        home      0, 0, 1, 0, 0       STK1R
000003        home      0, 0, 0, 2, 0       STK1R
000004        home      0, 0, 0, 3, 0       STK1R
000005        home      0, 0, 1, 5, 0       STK1R
000006        home      0, 0, 1, 8, 1       STK1R
000008        home      0, 0, 0,23, 0       STK1R
000009        home      0, 0, 1,12, 1       STK1R
<snip!>
2004-01-28 15:58:37    Volume Status
Identifier    Status    Current Location    Type
000047        home      0, 0, 1, 9, 0       STK1R
000048        home      0, 0, 1,10, 1       STK1R
000049        home      0, 0, 0, 7, 0       STK1R
000050        home      0, 0, 0, 4, 0       STK1R
FX0023        home      0, 0, 0, 0, 0       SDLT

2.2.7 Command to Start Request Processing


ACSSA> start
Start: ACSLM Request Processing Started: Success.

2.2.8 Vary on LSM


ACSSA> vary lsm
LSM identifier (acs,lsm): 0,0
LSM identifier (acs/lsm):
State(diagnostic/offline/online): online
2004-03-26 11:20:53 107 LSM 0,0: online
ACSSA> LSM 0,0 varied online

2.2.9 Logoff from ACSSA (ACSLS Server interface)


ACSSA> logoff

2.3 ACSLS TAPE CLEANING
ACS robot types are self cleaning. Tape cleaning should not be initiated by NetBackup. If a TapeAlert-based cleaning flag is set by LTID or avrd for an ACS, TLH, or an LMF drive, the vmd/DA will not release the drives. To disable TapeAlert checking and eliminate "TapeAlert is not supported" messages in the syslog, add the NO_TAPEALERT touch file.
For UNIX: /usr/openv/volmgr/database/NO_TAPEALERT
For Windows: <install path>\volmgr\database\NO_TAPEALERT

The StorageTek library transport control unit tracks how much tape passes through each transport and sends a message to ACSLS when a transport requires cleaning. If auto-cleaning is enabled, ACSLS automatically mounts a cleaning cartridge on the transport. If all the cleaning cartridges have expired (MAX_USAGE), ACSLS will post an error message 376N into the acsss_event log. If auto-cleaning is disabled, ACSLS logs a message in the event log and displays a message at the cmd_proc when cleaning is required.

This option is enabled or disabled using the acsss_config configuration utility. This utility will allow you to specify how the cartridges are ordered for selection and queries.

NOTE: You cannot use the acsss_config configuration program to enable auto-cleaning for drives attached to a SCSI connected library storage module (LSM).

For more information regarding ACSLS tape cleaning, please contact StorageTek.

3. DEVICE CONFIGURATION FOR ACSLS CONTROLLED TAPE DRIVES
This section covers device configuration.

3.1 DEVICE CONFIGURATION
NOTE: All Automated Cartridge System (ACS) robots configured on a media server must be configured with at least one drive, or the acsd daemon will exit, putting all Automated Cartridge System Library Software (ACSLS) drives in Automatic Volume Recognition (AVR) mode.

3.1.1 SSO Device Configuration for ACSLS
During setup (in the Device Configuration Wizard), NetBackup will attempt to discover the tape drives available to it and, for robot types where serialization is available, their positions within the library. NetBackup does not yet obtain drive serial numbers from the ACS robotic library control interface, so manual configuration is required. The manual configuration cannot be avoided in a non-Shared Storage Option (non-SSO) environment, where drives are not being shared. Using NetBackup 4.5 FP6, the user can significantly reduce the amount of manual configuration required by following these steps in an SSO environment.
1. Run the device configuration wizard on just one of the hosts where drives in an ACS-controlled library are attached. Let the drives be added as standalone drives.
2. Add the ACS robot definition, and update each drive to indicate its appropriate position in the ACS robot. (Make the drive robotic, and add the ACS, LSM, Panel, and Drive information.) See the VERITAS Media Manager System Administration Guide, Configuring Storage Devices chapter, in the section "Co-relating Device Files to Physical Drives When Adding Drives."
3. Verify the drive paths, if this hasn't already been done in the previous step, based on the documentation referenced above.
4. Once the drive paths have been verified on one host, re-run the device configuration wizard, and specify all hosts with ACS drives in the library to be scanned. The device configuration wizard will add the ACS robot definition and the drives to the remaining servers automatically, with correct device paths, assuming that the devices were successfully discovered, along with their serial numbers.

By following the above steps, the time savings can be significant. For example, if there are 20 drives shared on 30 hosts, the above configuration steps require just 20 paths to be manually configured, instead of 600 paths.

3.1.2 Non-SSO Device Configuration for ACSLS
During setup (in the Device Configuration Wizard), NetBackup will attempt to discover the tape drives available to it, and, for robot types where serialization is available, their positions within the library. Do not use the Device Configuration Wizard. NetBackup does not obtain drive serial numbers from the ACS robotic library control interface, so manual configuration is required.

3.1.3 Initial configuration of StorageTek T9940A and T9940B tape drives in an ACSLS environment
It is advised to separate the two drive types within the NetBackup Media Management device configuration to avoid density conflicts. This issue surfaces because ACS treats the T9940A and T9940B drive media as identical. However, the T9940B writes at a higher density, so a T9940A drive cannot read a tape written by a T9940B drive. When both drive types are used within a single library, a different density must therefore be used for each drive type. The same issue will occur with SDLT220 and SDLT320 drives in the same ACS-based library.

Workaround: Add the ACS robot to the NetBackup device configuration according to the steps described within the NetBackup Media Manager Device Configuration Guide. Configure the STK 9940A drives as type hcart and configure the STK 9940B drives as type hcart2. Then define a NetBackup storage unit for each density, hcart and hcart2.

Steps to inventory media for each density type:
1. In the /usr/openv/volmgr/vm.conf file on the server where vmupdate is run, add "IGNORE_WRONG_MEDIA_TYPE" to the end of the file.
2. Run /usr/openv/volmgr/bin/vmupdate -rn <robot number> -rt acs -acs_stk2p hcart
   All new media found will be configured as hcart.
3. Next, add the STK 9940B media to the robot.
4. Finally, run /usr/openv/volmgr/bin/vmupdate -rn <robot number> -rt acs -acs_stk2p hcart2
   All new media found will be configured as hcart2.

When an inventory of ACS robotics is done, NetBackup will receive a vendor media type back as well as the barcode. That vendor media type is mapped to one of the NetBackup media types. The tag "IGNORE_WRONG_MEDIA_TYPE" allows NetBackup to map a single ACS vendor media type to multiple NetBackup media types.
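Step 1 of the inventory procedure (adding IGNORE_WRONG_MEDIA_TYPE to vm.conf) is easy to repeat by accident. The sketch below appends an entry only when an identical line is not already present. The function name is hypothetical, and the exact-line match is an assumption: an entry written with different spacing would not be detected.

```shell
#!/bin/sh
# Hypothetical helper: append ENTRY to the vm.conf-style file CONF
# unless a line exactly equal to ENTRY already exists.
ensure_vm_conf_entry() {
    conf=$1
    entry=$2
    # -F: fixed string, -x: whole-line match, -q: quiet.
    if ! grep -Fqx -- "$entry" "$conf" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$conf"
    fi
}
```

A call such as ensure_vm_conf_entry /usr/openv/volmgr/vm.conf IGNORE_WRONG_MEDIA_TYPE can then be run before each vmupdate without duplicating the entry.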
Disadvantages:
1. If, for any reason, media is ejected from the library, verification is required when reinjecting media that it goes to the correct media type (hcart for 9940A, hcart2 for 9940B).
2. T9940B drives cannot be used to read the 9940A media; they have to be segregated.

4 ROBTEST FOR ACS LIBRARIES
This section covers acstest (robtest).

4.1 INVOKING ROBTEST
# /usr/openv/volmgr/bin/robtest

Configured robots with local control supporting test utilities:


ACS(0) ACSLS host = taco

Robot Selection
---------------
1) ACS 0
2) none/quit
Enter choice: 1

Robot selected: ACS(0) ACSLS host = taco, SSI socket = 13741

Invoking robotic test utility:
/usr/openv/volmgr/bin/acstest -r taco -s 13741 -d /dev/rmt/1cbn 0,3,1,0

Server 0 with 24 free cells is in state "STATE_RUN"
QUERY SERVER complete

Enter acs commands (? returns help information)

4.2 ROBTEST SYNTAX


?
To exit the utility, type q or Q.
cancel <request_id>                  - Cancel server request
defpool <pool> <lwm> <hwm> <attrib>  - Define scratch pool
delpool <pool>                       - Delete empty scratch pool
dm <vol> [<drive>|<drive_id>] [f]    - Dismount volume (optionally forced)
drstat <drive_id>                    - Print drive status
eject <cap_id> <vol_list>            - Eject a list of volumes to the specified CAP
enter <cap_id>                       - Enter volumes in the specified CAP
capstat [<cap_id>]                   - Print CAP status
varycap <cap_id> online|offline      - set the state of the given CAP
setmode <cap_id> automatic|manual    - set the mode of the given CAP
setpriority <cap_id> <priority>      - set the priority of the given CAP
m <vol> [<drive>|<drive_id>]         - Mount volume
qmmi                                 - Query actual mixed media information
qpool [<pool>]                       - Query pools
qreq [<request_id>]                  - Query server requests
qscr [<pool>]                        - Query scratch volumes by pool
qserver                              - Query ACSLS server
qvol [<vol>]                         - Query volumes
setscr <pool> ON|OFF <vol> [<vol>]   - Set scratch attributes for volume range
start                                - Start ACS Library Manager (request RUN state)
types                                - Print list of known ACS media types

SCSI commands:
unload <drive>|<drive_id>            - Issue SCSI unload

where: <acs>=0-126, <lsm>=0-23, <cap>=0-2, <drive>=0-15, <priority>=0-127
       <drive_id> = [<acs>,<lsm>,<panel>,<drive>]
       <drive> = d1 if drive 1, d2 if drive 2, ..., d15 if drive 15
       <lwm> = scratch pool low water mark
       <hwm> = scratch pool high water mark
       <cap_id> = <acs>,<lsm>,<cap>
       <vol_list> = <vol1>[:<vol2>:...:<vol42>]

4.2.1 Command to Obtain Drive Status


drstat
Drive 1 information:
    ID (acs,lsm,panel,drv): 0,0,0,0
    drive type:             DLT7000
    volume ID:              <none>
    state:                  STATE_ONLINE
    status:                 STATUS_DRIVE_AVAILABLE
Drive 2 information:
    ID (acs,lsm,panel,drv): 0,0,0,1
    drive type:             DLT7000
    volume ID:              <none>
    state:                  STATE_ONLINE
    status:                 STATUS_DRIVE_AVAILABLE
Drive 3 information:
    ID (acs,lsm,panel,drv): 0,0,0,2
    drive type:             9840
    volume ID:              <none>
    state:                  STATE_ONLINE
    status:                 STATUS_DRIVE_AVAILABLE
Drive 4 information:
    ID (acs,lsm,panel,drv): 0,0,0,3
    drive type:             9840
    volume ID:              <none>
    state:                  STATE_ONLINE
    status:                 STATUS_DRIVE_AVAILABLE
DRIVE STATUS complete

4.2.2 Command to Query Volumes


qvol
000002    STK1R    home    0, 0, 1, 0, 0
000003    STK1R    home    0, 0, 0, 2, 0
000004    STK1R    home    0, 0, 0, 3, 0
000005    STK1R    home    0, 0, 1, 5, 0
000006    STK1R    home    0, 0, 1, 8, 1
000008    STK1R    home    0, 0, 0, 23, 0
QUERY VOLUME complete

4.2.3 Command to Mount a Volume


m 000040 0,0,0,2
MOUNT complete

4.2.4 Command to Dismount a Volume


dm 000040 0,0,0,2 f
DISMOUNT complete

4.3 HOW TO DEFINE ACSLS SCRATCH POOLS AND ADD VOLUMES USING ROBTEST
Start the robtest utility:
On UNIX:
# /usr/openv/volmgr/bin/robtest

On Windows:
<install_path>veritas\volmgr\bin\robtest.exe

Select the ACS robot. Enter the 'define pool' command as follows:
defpool 4 0 500 1
Scratch pool 4 has been defined

NOTE: DEFINE POOL completes robot inventories for ACS, where:
    4 is the pool number
    0 500 are the low and high water marks
    1 is overflow on (could be 0 for overflow off)

Next, define ACSLS scratch volumes in this pool:
qpool 4
Pool ID 4 has 0 volumes
QUERY POOL complete
setscr 4 ON 000040 000044
000040 STATUS_SUCCESS
000041 STATUS_SUCCESS
000042 STATUS_SUCCESS
000043 STATUS_SUCCESS
000044 STATUS_SUCCESS
SET SCRATCH complete
qpool 4
Pool ID 4 has 5 volumes
QUERY POOL complete

Quit 'robtest' and perform a normal robot inventory with NetBackup.

5 MEDIA
This section covers "available_media" script output and tape ejection.

5.1 AVAILABLE MEDIA SCRIPT
The utility /usr/openv/netbackup/bin/goodies/available_media does not report robot slot numbers for ACS libraries. This is not a bug. The information on slot location is managed by ACS and not by Media Manager. Below is a sample output from the available_media report, which can be generated by running the following command:
Windows: <install_path>\netbackup\bin\goodies\available_media
UNIX: /usr/openv/netbackup/bin/goodies/available_media
media     media   robot   robot   robot   side/   ret     size     status
 ID       type    type      #     slot    face    level   Kbytes
--------------------------------------------------------------------------
NetBackup pool

NB0001    DLT     ACS       0                     0       2848     ACTIVE
NB0002    DLT     ACS       0                     0       2848     ACTIVE
ABC234    DLT     ACS       0                                      AVAILABLE
ABC345    DLT     ACS                                              AVAILABLE

5.2 HOW TO SPECIFY WHICH MEDIA ACCESS PORT (MAP) TO USE FOR TAPE EJECTION
In the media server's /usr/openv/volmgr/vm.conf file, it is possible to specify the media access port (MAP) to use when ejecting media to a particular ACS robot. If this entry is present, NetBackup (including the Vault extension) will eject to the specified MAP instead of the default 0,0,0 MAP. The vm.conf entry syntax:
MAP_ID = robot-num map-id

Example: If a user wants the ACS(0) robot to eject via its 0,0,1 MAP and the ACS(1) robot to eject via its 0,1,0 MAP, the following vm.conf entries would be necessary on the media servers that use these robots:
MAP_ID = 0 0,0,1
MAP_ID = 1 0,1,0
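To check which MAP a given robot will eject to, the MAP_ID entries can be read back from vm.conf. This is a minimal sketch with a hypothetical function name; it assumes one entry per line, written exactly in the "MAP_ID = robot-num map-id" form shown above with spaces around the equals sign.

```shell
#!/bin/sh
# Hypothetical helper: print the map-id configured for robot number $1
# in the vm.conf-style file $2. Prints nothing if no entry exists, in
# which case the document states that the default 0,0,0 MAP is used.
map_id_for_robot() {
    awk -v robot="$1" \
        '$1 == "MAP_ID" && $2 == "=" && $3 == robot { print $4 }' "$2"
}
```

For example, map_id_for_robot 1 /usr/openv/volmgr/vm.conf prints the MAP configured for ACS(1), if any.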

6. COMMUNICATION
This section covers RPC and communication.

6.1 REMOTE PROCEDURE CALL (RPC)
NetBackup uses RPC to connect to the ACSLS server, and rpcbind is the service that converts RPC program numbers into universal addresses. It must be running on the host to be able to make RPC calls on a server on that machine. If this service is not running, the root file system could fill up: NetBackup could send data to the links in /dev, and since they cannot connect to the drives, the data will spool until the system runs out of disk space. Since the links in /dev have no size, it will be difficult to determine what is using up all the disk space on the file system.

6.1.1 How to start RPC on different operating systems
Starting RPC is best accomplished using the operating system vendor startup scripts. The following are the ways of starting RPC on various operating systems.
a. Solaris
   i. Solaris (7,8,9)
      # /etc/init.d/rpc start
   ii. Solaris 10
      # svcadm enable svc:/network/rpc/bind:default
      # svcadm restart svc:/network/rpc/bind:default
b. HP-UX
      # /sbin/init.d/Rpcd start
c. AIX
      # startsrc -s portmap
d. Linux
      # /etc/rc.d/init.d/portmap start
e. Tru64
      # /usr/sbin/portmap
f. Windows
      Click Start | Settings | Control Panel | Administrative Tools | Services. Select Remote Procedure Call (RPC) and click Start.
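The per-platform start commands listed in this section can be collected into a small lookup keyed on uname output. This is a sketch only: the function name is hypothetical, it prints the command rather than running it, and it assumes the usual uname -s values (SunOS, HP-UX, AIX, Linux, OSF1 for Tru64) and uname -r values (5.7 through 5.10 for Solaris 7 through 10).

```shell
#!/bin/sh
# Hypothetical helper: print the RPC start command from this section
# for the platform named by $1 (uname -s) and $2 (uname -r).
# Returns non-zero for platforms the section does not cover.
rpc_start_command() {
    os=$1
    release=$2
    case "$os" in
        SunOS)
            case "$release" in
                5.7|5.8|5.9) echo "/etc/init.d/rpc start" ;;
                5.10)        echo "svcadm enable svc:/network/rpc/bind:default" ;;
                *)           return 1 ;;
            esac
            ;;
        HP-UX) echo "/sbin/init.d/Rpcd start" ;;
        AIX)   echo "startsrc -s portmap" ;;
        Linux) echo "/etc/rc.d/init.d/portmap start" ;;
        OSF1)  echo "/usr/sbin/portmap" ;;
        *)     return 1 ;;
    esac
}
```

Typical use is rpc_start_command "$(uname -s)" "$(uname -r)", reviewing the printed command before running it as root.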

6.1.2 How to Verify that RPC is Running
The following commands will verify that rpcbind is active and that the RPC service is functioning between the media server and ACSLS Library Server. From a terminal window on the media server, issue the following command to ACSLS.
# rpcinfo
   program version netid      address          service    owner
    100000    4    ticots     hotdog.rpc       rpcbind    superuser
    100000    3    ticots     hotdog.rpc       rpcbind    superuser
    100000    4    ticotsord  hotdog.rpc       rpcbind    superuser
    100000    3    ticotsord  hotdog.rpc       rpcbind    superuser
    100000    4    ticlts     hotdog.rpc       rpcbind    superuser
    100000    3    ticlts     hotdog.rpc       rpcbind    superuser
    100000    4    tcp        0.0.0.0.0.111    rpcbind    superuser
    100000    3    tcp        0.0.0.0.0.111    rpcbind    superuser
    100000    2    tcp        0.0.0.0.0.111    rpcbind    superuser
    100000    4    udp        0.0.0.0.0.111    rpcbind    superuser
    100000    3    udp        0.0.0.0.0.111    rpcbind    superuser
    100000    2    udp        0.0.0.0.0.111    rpcbind    superuser

If the service is not running, rpcinfo will report:
"can't contact rpcbind: RPC: rpcbind failure - RPC: Failed ( unspecified error )"
Examine the /export/home/ACSSS/log/acsss_event.log for "RPC: Rpcbind failure." The error message should include the IP address that it is trying to communicate with. Verify it is the correct IP address.

6.1.3 How to Verify the ACSSS Program Registration
# rpcinfo -t {acsls_hostname} 300031 2
program 300031 version 2 ready and waiting
# rpcinfo -t {acsls_hostname} 300031 1
program 300031 version 1 ready and waiting
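The registration check above can be wrapped so that scripts get a pass or fail result. This is a minimal sketch with hypothetical function names; it assumes that a successful rpcinfo reply contains the text "ready and waiting", as in the sample output, and it reuses the program number 300031 from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if stdin contains a "ready and waiting"
# reply, as printed by rpcinfo in the examples above.
acsls_rpc_ready() {
    grep -q 'ready and waiting'
}

# Hypothetical helper: query program 300031 at version $2 on host $1
# and report success only when the program answers.
check_acsls_rpc() {
    host=$1
    version=$2
    rpcinfo -t "$host" 300031 "$version" 2>&1 | acsls_rpc_ready
}
```

For example, check_acsls_rpc {acsls_hostname} 2 && echo "version 2 registered" reports only when the ACSLS server answers for version 2.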

You should get a response from both programs, but you only need a response from the version that you are using (2 = TCP or 1 = UDP). The NetBackup default for this communications service is UDP.

6.1.4 Basic snoop Output
Basic snoop of a query server sent automatically by initiating the robtest utility:
# snoop salad carter
1 0.00000 salad -> carter.min.veritas.com PORTMAP C GETPORT prog=300031 (?) vers=1 proto=UDP
2 0.00108 carter.min.veritas.com -> salad PORTMAP R GETPORT port=1025
3 0.00056 salad -> carter.min.veritas.com RPC C XID=1066379121 PROG=300031 (?) VERS=1 PROC=1000
4 0.00091 carter.min.veritas.com -> salad RPC R (#3) XID=1066379121 Success
5 0.00459 carter.min.veritas.com -> salad RPC C XID=1066114434 PROG=1073741824 (transient) VERS=1 PROC=1000
6 0.00031 salad -> carter.min.veritas.com RPC R (#5) XID=1066114434 Success
7 0.15656 carter.min.veritas.com -> salad RPC C XID=1065841469 PROG=1073741824 (transient) VERS=1 PROC=1000
8 0.00029 salad -> carter.min.veritas.com RPC R (#7) XID=1065841469 Success

This trace shows that the portmapper and program registration were successful. For additional assistance with NetBackup options as they pertain to ACSLS, reference the Media Manager System Administrator's Guide.

7 COMMON ACS ERROR MESSAGES
This section covers common ACSLS error messages.

7.1 ACS(2) UNAVAILABLE: INITIALIZATION FAILED: UNABLE TO INITIALIZE ROBOT
Resolution: Verify the IP address specified for the robotic host is correct in LibAttach on the master server.

7.2 ACS STATUS = 54, STATUS_IPC_FAILURE
Robtest will show:
acs_query_server() failed
Unable to query server taco, ACS status = 54, STATUS_IPC_FAILURE
Robotic test utility /usr/openv/volmgr/bin/acstest returned abnormal exit status (1).

Cause: Local network interface is down.

7.3 ACS STATUS = 72, STATUS_PENDING
Example 1: robtest will show:
acs_response() failed
Unable to obtain Query Server acknowledge response, ACS status = 72, STATUS_PENDING
Robotic test utility /usr/openv/volmgr/bin/acstest returned abnormal exit status (1).

Media Server event.log will log:


02-05-04 13:49:18 SSI[0]:
ONC RPC: csi_rpccall(): status:STATUS_NI_FAILURE; failed: clntudp_create()
RPC UDP client connection failed, RPC: Rpcbind failure
Remote Internet address:10.82.56.67, Port: 0

Cause: ACSLS server network interface is down.

Example 2: Media server system log:
Nov 15 11:18:06 hoehpt07 acsd[8807]: ACS(0) Response has not been returned by Mount command sequence 4434, ACS status = 72, STATUS_PENDING
Nov 15 11:28:06 hoehpt07 acsd[8807]: ACS(0) Response has not been returned by Mount command sequence 4434, ACS status = 72, STATUS_PENDING

Event log from the ACSLS :


2003-05-22 11:33:51 CSI[0]:
1022 N csi_net_send.c 1 474
ONC RPC: csi_net_send(): status:STATUS_NI_TIMEDOUT; failed: st_net_send()
Cannot send message to NI:discarded, Network timeout
Errno = 0 (none)
Remote Internet address: 207.169.154.59, Port: 53429

2003-05-22 11:33:51 CSI[0]:
1026 N csi_freeqmem.c 1 142
ONC RPC: csi_freeqmem(): status:STATUS_QUEUE_FAILURE; Dropping from Queue:
Remote Internet address: 207.169.154.59,Port: 53429 , ssi_identifier: 1, Protocol: 2, Connect type: 1

2003-05-22 11:33:51 ACSSA[0]:
1432 N sa_demux.c 1 273
Server System network interface timeout.

Cause: The errors above indicate that NetBackup is able to reach the ACSLS server with a status request, but the ACSLS server is unable to respond. By default, the IP address sent to the ACSLS as part of the STATUS request packet is that of the primary hostname of the media server, i.e. the hostname given by uname -a. This error occurs when the ACSLS cannot perform the reverse name lookup or is configured (via routing) to reach only a secondary interface on the media server.

If the issue is a failed reverse name lookup, add the media server's IP address to the domain name service (DNS) reverse tables or to the ACSLS /etc/hosts file. If the ACSLS server cannot route to the media server's hostname, override the default behavior by using "ACS_SSI_HOSTNAME = <hostname>" in the /usr/openv/volmgr/vm.conf file, where the value for <hostname> is a hostname associated with an IP address the ACSLS can reach on the media server.

7.4 STATUS_NI_FAILURE
Explanation of message:
STATUS_NI_TIMEDOUT: The CSI (media server) has timed out waiting for a response from a client (ACSLS).

The message "Waiting to obtain XXXXX ACS sequence YYY acknowledge response" indicates that the daemon has not received an acknowledgment from the LibraryStation module for 30 seconds (which is a hard-coded limit) following a sent command. "Unable to obtain" means it has given up. The timeouts seen occur after a 30 second delay, when no acknowledgment has been received.

These timeout periods are determined by two tunable environment variables:
CSI_RETRY_TIMEOUT - The default for which is 3 seconds, not 2 as described in the Media Management manual.
CSI_RETRY_TRIES - The default for this is 5 retries.

Changes: Add the following to /usr/openv/volmgr/vm.conf on the media server.
CSI_RETRY_TIMEOUT=30
CSI_RETRY_TRIES=10

or

Change the OS environment variable CSI_RETRY_TIMEOUT to 30 and CSI_RETRY_TRIES to 10. From bash/ksh prompt:
# CSI_RETRY_TIMEOUT=30;export CSI_RETRY_TIMEOUT
# CSI_RETRY_TRIES=10;export CSI_RETRY_TRIES
Add to the NetBackup startup script for a permanent solution.

Isolating the problem:
a. Ensure that the media server can successfully ping the ACS server and vice-versa.
b. Check that ltid has started the acsd, acsssi and acssel daemons on the VERITAS media server by running bpps -a from /usr/openv/netbackup/bin or vmps from /usr/openv/volmgr/bin. If acsssi is not running, ensure that RPC is running.
c. Run snoop between the media server and ACS. Initiate robtest and check that the query server, sent upon robtest initialization, is responded to by the ACSLS server (reference 'Basic snoop Output' above).
d. Verify the RPC communications between the media server and ACSLS host using the rpcinfo command. The rpcinfo -t <acs_host> 300031 1 command checks RPC connectivity, portmapper registration and that the ACSLS program (service) is available (reference 'How to start RPC' above).
e. Check the event logs from the media server and library server for errors (reference Logging above).
f. Check the syslog or event file for errors.
g. Enable tracing (reference Logging above).
h. This error can occur if the users.ALL.allow file on the ACSLS Library Server does not contain an entry for the requesting media server. This file is found on the ACSLS server in the $ACS_HOME/data/external/access_control directory and is used for granting/denying library access. For example entries of the users.ALL.allow file, consult the ACSLS Administrator's Guide.

7.4.1 ACS status = 104, STATUS_NI_FAILURE
robtest will show:
acs_response() failed
Unable to obtain Query Server acknowledge response, ACS status = 104, STATUS_NI_FAILURE
Robotic test utility /usr/openv/volmgr/bin/acstest returned abnormal exit status (1).

acsssi event.log:
02-05-04 13:31:54 SSI[0]:
ONC RPC: csi_rpccall(): status:STATUS_NI_FAILURE; failed: clntudp_create()
RPC UDP client connection failed, RPC: Program not registered
Remote Internet address:10.82.56.67, Port: 0;

Cause: ACSLS is down or not responding.

7.4.2 ACS status = 105, STATUS_NI_TIMEDOUT
Example of the message log:
Aug 26 17:06:57 sdhra1a acsd[15914]: ACS(0) Unable to obtain Query ACS sequence 3420 acknowledge response, ACS status = 105, STATUS_NI_TIMEDOUT
Aug 26 17:06:57 sdhra1a acsd[5476]: DecodeDismount(): ACS(0) driveid 0,0,10,5, Actual status: Unable to initialize robot
Aug 26 17:06:57 sdhra1a acsd[5476]: ACS(0) going to DOWN state, status: Unable to initialize robot
Aug 26 17:07:15 sdhra1a acsd[15910]: ACS(0) Waiting to obtain Query Drive sequence 3381 acknowledge response, ACS status = 72, STATUS_PENDING

Cause: The CSI (media server) has timed out waiting for a response from a client (ACSLS).

Windows NT and Win2000: STK LibAttach for Windows is required for ACSLS use. The current version of STK LibAttach for Windows allows only a single instance of ACSLS, which means you can enter only one IP address for the Library Server. Multiple ACSs on the same Library Server are allowed. The VERITAS robot definition must match the address specified in LibAttach.
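When working through the error messages in this section, it can help to see which ACS status codes dominate a system log. The following is a minimal sketch with a hypothetical function name; it assumes one log entry per line, each containing the text "ACS status = <number>, <NAME>" as in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print
# "<code> <name> <occurrences>" for each ACS status found.
# Lines without an "ACS status = N, NAME" field are ignored.
acs_status_counts() {
    sed -n 's/.*ACS status *= *\([0-9][0-9]*\), *\([A-Z_]*\).*/\1 \2/p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

For example, running acs_status_counts with the media server system log as input shows at a glance whether STATUS_PENDING (72) or STATUS_NI_TIMEDOUT (105) entries are the more frequent.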
