
POWERHA Implementation Overview

Andrew Lanczi
Certified Consulting I/T Specialist
alanczi@us.ibm.com

© IBM Corporation 2008


Objectives

Understand the implementation of HACMP

- Planning
- How HACMP works
- Configuration options
- HACMP/XD


Over 60,000 Licenses Worldwide

Regions: The Americas; Europe / Africa / Asia-Pacific

Representative HACMP customer workloads:
- Police & fire, fleet services, fleet management
- Air reservations, air traffic
- Telco billing, telco directory, cellular phone service
- Publishing, entertainment, security services
- Manufacturing plant floor, PCB process manufacturing, process control
- Banking ATMs, trading systems, bond trading
- Credit verification, credit processing, financial
- Claims processing, health & hospital
- Inventory management, retail, ISPs
- File servers, NIC servers
HACMP Version Summary

Version   Availability   End of Support
V4.5      June 2002      Sept 2005
V5.1      July 2003      Sept 2006
V5.2      July 2004      Sept 2007
V5.3      July 2005      Sept 2008
V5.4      July 2006      Sept 2009 (est.)

(The original slide showed supported versions in green and out-of-support
versions in red; as of this 2008 presentation, only V5.3 and V5.4 remained
supported.)
HACMP V5.4 – Key Features

- Faster failure detection using First Failure Data Capture (FFDC)
- HACMP on Linux (for POWER)
- Performance improvements to HACMP/XD GLVM
  - Can have up to 4 data mirroring networks
  - Support for Enhanced Concurrent Volume Groups
- GPFS 2.3 support
- IPAT support on geographic networks (for HACMP/XD)


HACMP Cluster

[Diagram: a client connects over one or more networks to Node A and
Node B, which share access to a common disk.]

A typical cluster consists of nodes, networks, shared storage, and clients.

- HACMP supports 2 - 32 nodes per cluster
Hardware

pSeries / POWER servers
- No integrated serial ports for heartbeat on p5 servers
- See the announcement letter for limitations - #5765-F62

Network/SAN
- 2-port async RS-232 - FC 5723
- 2 Gb FC PCI-X - FC 5716
- 10 Gb - FC 5718 and FC 5719
- IBM, Cisco, McDATA, Brocade, etc.
- TotalStorage SAN Volume Controller software


Additional Hardware

TotalStorage products
- DS8000 - online firmware update is now supported (code level
  6.0.0.324 R10h.9b050406 and higher)
- DS6000 - online firmware update is not supported in HACMP clusters
- DS4100 and DS4000 EXP100 Serial ATA hardware
  - DS4100 does not support multipath I/O, so no multipath fallover
  - C-SPOC cannot be used to add a DS4100 disk to AIX
- TotalStorage DS4000 EXP710 FC storage expansion unit (1740-710)
- TotalStorage ESS (2105-F20, 2105-800)
- OEM per CSA: EMC, HDS, ...


Online Planning Worksheets (OLPW)

Stand-alone tool for planning a cluster
- Can be used to configure a cluster with the cl_opsconfig utility
- Does not monitor or manage a cluster
- Usable on AIX or Windows 2000

Installation requirements
- Java Runtime Environment version 1.3.0 or higher
  - AIX already ships this JRE level
  - Windows needs it installed
  - www.ibm.com/developerworks/webservices/sdk


ASCII Based Cluster Configuration

Well-defined and documented XML file structure using a Document Type
Definition (DTD) and XML schema

The configuration file can be passed to XML editors
- Allows customers to modify cluster configs that can then be pushed
  to multiple clusters
- cl_exportdefinition - creates a worksheet from an existing cluster
- External DTD
  /usr/es/sbin/cluster/worksheets/hacmp-v5300.dtd
- External XML schema
  /usr/es/sbin/cluster/worksheets/hacmp-v5300.xsd


ASCII Based Cluster Configuration

Duplicate a cluster with changes:

[Flow: existing HACMP cluster -> Export Definition file for OLPW
(XML file) -> XML editor -> updated XML file -> cl_opsconfig -> new
HACMP cluster]
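As a rough sketch of this flow from the command line (the utility names
and file paths match those above; the exact option syntax is an
assumption - check the cl_exportdefinition and cl_opsconfig man pages):

    # Export the running cluster's definition to an XML worksheet file
    # (option syntax is an assumption; see the man page)
    /usr/es/sbin/cluster/utilities/cl_exportdefinition -o /tmp/mycluster.xml

    # Edit /tmp/mycluster.xml in any XML editor, validating against
    # /usr/es/sbin/cluster/worksheets/hacmp-v5300.xsd

    # On the new cluster, apply the updated definition
    /usr/es/sbin/cluster/utilities/cl_opsconfig /tmp/mycluster.xml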


Web-based Cluster Management-webSMIT

[Screenshot: the main WebSMIT page after login]


Web-based Cluster Management-webSMIT

Requirements
- Any Apache-compliant web server
  - IBM HTTP Server
  - Apache
  - /usr/es/sbin/cluster/wsm/README tells how to install Apache from RPMs
- Fileset cluster.es.client.wsm
- (Optional) documentation filesets:
  - cluster.doc.en_US.es.pdf
  - cluster.doc.en_US.es.html


HACMP Components

RSCT (Reliable Scalable Cluster Technology)
- RMC subsystem

Cluster Manager (clstrmgr)
- Recovery driver
- SNMP services

clcomd - cluster communications daemon

clinfo - cluster information services (optional)


How HACMP Works

Heartbeat is used to monitor health.

[Diagram: two nodes exchange heartbeats over an IP network, an RS-232
link, and shared disks.]


Networks for an HACMP cluster

At least two networks are recommended
- One physical network with multiple logical IP subnets
- One non-IP network
  - RS-232
  - disk heartbeat
- The goal is to avoid a partitioned cluster
  - both nodes always get the latest information

Decide on the mechanism to provide availability of service addresses
- IPAT via Replacement
- IPAT via Aliasing - the default

Other requirements
- persistent IP addresses
Which IPAT?

"IPAT via Aliasing": IP address takeover performed by moving


an IP alias address from one interface to another , without
changing the base address of the interface
ƒIP aliasing allows multiple resource groups to be configured
using the same adapters
ƒIP aliased networks use boot and service labels
‹Boot label on standby adapters as well
‹No Hardware Address Takeover on Service IP

"IPAT via IP Replacement": IP address takeover performed


by swapping an interfaces boot-time address with a service IP
address
ƒOnly one address can be active on an interface at any time

© IBM Corporation 2008


Network IPAT Connection options
IPAT via Aliasing (sysa / sysb):
  a_boot1 10.10.20.1     b_boot3 10.10.20.10
  a_svc   192.37.56.1
  a_boot2 10.10.30.1     b_boot4 10.10.30.2

IPAT via Replacement (sysa / sysb):
  a_boot    192.37.56.10   b_svc     192.37.56.20
  a_svc     192.37.56.1
  a_standby 10.10.20.1     b_standby 10.10.20.2

[Diagram: sysa and sysb attach to Ethernet1 and Ethernet2; the client
reaches the service addresses over Ethernet1; subnet mask 255.255.255.0]


Networking with switches

One Layer 3 VLAN with multiple logical subnets

Do not place intelligent network equipment that does not transparently
pass UDP broadcasts and other packets between cluster nodes. If such
equipment sits in the paths between cluster nodes and clients, use
PING_CLIENT_LIST (in clinfo.rc).
CISCO Example with HACMP
Assume that the customer is using the standard Cisco Switch product line of 3550, 3750, 49xx, 6500, etc.
At the L3 level, one vlan is all that is needed to satisfy the HACMP setup. You just define multiple
IP addresses as secondary addresses on this vlan interface for the multiple subnets. For example: the
three subnets are 1.1.1.0/24, 1.1.2.0/24, and 1.1.3.0/24 as primary, standby, alias HA networks
respectively all using vlan 50. On a Cisco L3 switch/router using IOS, you would code the following:

Switch# config t
Switch(config)# vlan 50
Switch(config-vlan)# name HACMP_Setup
Switch(config-vlan)# exit
Switch(config)# int vlan 50
Switch(config-if)# ip address 1.1.1.254 255.255.255.0
Switch(config-if)# ip address 1.1.2.254 255.255.255.0 secondary (standby IP address net)
Switch(config-if)# ip address 1.1.3.254 255.255.255.0 secondary (alias IP address net)
Switch(config-if)# no shut
Switch(config-if)# exit
Switch(config)# exit
Switch#
You now have vlan 50 customized with three different ip address identities (one for each of the
subnets), and all of them pingable. Alias and standby/boot are tagged as secondary.
Persistent Labels

An IP alias that is always available as long as a service or boot
interface on the network is active

- Intended to provide administrators access to a node
- Only one persistent label per node per network is allowed
- Once synchronized, they are always available
- Can be used for the HATivoli oserv process IP


Heartbeat Over RS232

A point-to-point non-IP serial network
- Usually implemented using an async adapter and a null-modem cable
  connection
- On some pSeries servers that have 3 or 4 built-in serial ports,
  ports 2, 3, or 4 can be used for this connection
  - Built-in serial port 1 is not supported for HACMP
- In LPAR mode it is better to allocate a PCI slot for the async
  adapter on each server that uses an RS-232 serial network
- Check documentation to make sure the port is supported prior to
  implementation - on p5 servers the integrated ports cannot be used!


Heartbeat Over Disk (diskhb)

Provides users with:
- A point-to-point network type that is easy to configure
- Additional protection against cluster partitioning
- A serial network that can use any disk type
- No additional hardware required

For customers that consider RS-232, tmssa, or tmscsi too costly or
complex to set up
- Requires an enhanced concurrent VG
- Configured via the "Extended Configuration" path
- Uses a disk sector formerly reserved for clvmd

May not be a good alternative on a disk with heavy I/O
Heartbeat Over Disk (diskhb)

Any disk in an enhanced concurrent VG can be used for point-to-point
networks.

[Diagram: three nodes share enhconcvg; diskhb network disknet1 runs
over hdisk1, disknet2 over hdisk2, and disknet3 over hdisk3, forming
point-to-point heartbeat paths between each pair of nodes.]
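Before starting cluster services, a diskhb path can be smoke-tested
with the RSCT dhb_read utility (hdisk1 is the example disk from the
figure; start the receiver first):

    # On the first node: receive heartbeats on the diskhb disk
    /usr/sbin/rsct/bin/dhb_read -p hdisk1 -r

    # On the second node: transmit heartbeats on the same disk
    /usr/sbin/rsct/bin/dhb_read -p hdisk1 -t

    # "Link operating normally" on both nodes indicates the path is good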


Cluster Communication Daemon

clcomd
- provides a secure transport layer
- caches ODMs for performance
  - /var/hacmp/odmcache - about 1 MB per node
- managed by SRC and started by init; the inittab entry is:
  clcomdES:2:once:startsrc -s clcomdES > /dev/console 2>&1

Source addresses are checked against
- /usr/es/sbin/cluster/etc/rhosts
- HACMPadapter ODM
- HACMPnode ODM (communication paths)
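A quick health check of clcomd, sketched with standard SRC commands and
the paths listed above:

    # Confirm the subsystem is active under SRC
    lssrc -s clcomdES

    # Review the source-address check file (one IP label or address per line)
    cat /usr/es/sbin/cluster/etc/rhosts

    # After editing rhosts, have clcomd reread its configuration
    refresh -s clcomdES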


Cluster Communication Daemon

Security strategies
- The default is autodiscovery
- AIX cluster security - CtSec
- Use a VPN tunnel
  - Set up persistent IP labels on the same subnet
  - Use chssys to add the -p flag to clcomd
  - Specify port 6191 (clcomd entry in /etc/services)
  - Use the extended VPN configuration screen to secure traffic for
    other cluster services
- If there is an unresolvable label in /usr/es/sbin/cluster/etc/rhosts,
  all connections will be denied

Log files
- /var/hacmp/clcomd/clcomd.log[.0] - up to 1 MB each
- /var/hacmp/clcomd/clcomddiag.log[.0] - up to 9 MB each


LVM and Disks

Use mirrored logical volumes
- Including mirrored jfslogs
- Consider quorum issues

Use multiple connections from the servers to the disk subsystem(s)

[Diagram: volume group DBVG (jfsloglv, dblv, db2lv) is mirrored to a
second copy DBVG' (jfsloglv', dblv', db2lv') on separate disks.]


Fast Disk Takeover

Requires Enhanced Concurrent Volume Groups in non-concurrent resource
groups
- Provides a significant performance gain for takeover of volume
  groups consisting of a large number of disks
- Uses RSCT for communication
- HACMP coordinates activity between nodes - active vs. passive
  varyon, etc.
- Requires the bos.clvm.enh fileset
- If migrating, shared VGs must be converted
  - System Management (C-SPOC) - recommended
  - or chvg -C on ALL cluster nodes (see the sketch below)
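If C-SPOC is not used, the conversion sketched below must be run on
every cluster node (sharedvg is a placeholder volume group name):

    # Convert an existing shared VG to enhanced concurrent capable
    chvg -C sharedvg

    # Verify - lsvg reports the VG as Enhanced-Capable
    lsvg sharedvg | grep -i concurrent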


Cross Site LVM mirroring

Management feature to simplify the configuration of LVM mirroring
between two sites
- Provides automatic LVM mirror synchronization after a disk failure,
  once the node/disk becomes available again in the SAN network
- Maintains data integrity by eliminating manual LVM operations
- Cluster verification enhancements help ensure the data's high
  availability
- Keeping the data in different locations eliminates the possibility
  of data loss from a disk failure at one site
- For high availability, each mirror copy should be located on a
  separate physical disk, in a separate disk enclosure, at a separate
  physical location
- LVM mirroring allows up to three data copies
- Mirror synchronization is required for stale partitions


Cross-Site LVM Mirroring

Two sites connected using a SAN network

[Diagram: Site A (Node A, Node B, PV1, PV2, FC Switch 1) and Site B
(Node C, Node D, PV3-PV6, FC Switch 2) are linked through the SAN so
each site holds a mirror copy of the data.]
VIO Server

[Diagram: a VIO Server partition owns hdisk1 and hdisk2 on a SAN
storage subsystem and exports AIX1VG and AIX2VG through the hypervisor
to client partitions AIX1 and AIX2, along with Ethernet connectivity.]

VIOS owns physical disk resources
- LVM-based storage on the VIO Server

LPARs see disks as vSCSI (Virtual SCSI) devices
- Virtual SCSI devices added to a partition via the HMC
- LUNs on the VIOS accessed as vSCSI disks
VIO Server Implementation

A single VIO Server configuration has exposures:
- The VIO Server partition is shut down or fails
- Network connectivity through the VIO Server
- Disk failure
- System failure


High Availability with Dual VIO Servers

[Diagram: two VIO Servers, each with a Shared Ethernet Adapter and
vSCSI/vLAN connections through the POWER Hypervisor, serve client
partitions AIX1 and AIX2; external servers attach on VLAN 1 and
VLAN 2.]

IEEE VLANs
- Up to 4096 VLANs
- Up to 65533 vENET adapters
- 21 VLANs per vENET adapter

Virtual Ethernet
- Partition-to-partition communication
- Requires AIX 5L V5.3 and POWER5

VLAN - Virtual LAN
- Provides the ability for an adapter to be on multiple subnets
- Provides isolation of communication to VLAN members

Shared Ethernet Adapter
- Provides access to the outside world
- Uses a physical adapter in the Virtual I/O Server
- Allows a single adapter to support multiple subnets
VIO Server with HACMP Cluster

[Diagram: one VIO Server partition serves HACMP partitions AIX1 and
AIX2 (AIX1VG and AIX2VG on hdisk1/hdisk2) through the hypervisor, with
shared Ethernet connectivity.]

Issues:
- Network connectivity?
- Shared disk access?
- SPOFs

Available via Advanced POWER Virtualization
Custom Resource Groups

A collection of resources is a resource group. Resources can be:
- Applications
- Volume groups, disks, filesystems
- IP addresses

Custom resource groups
- Users explicitly specify the desired startup, fallover, and fallback
  behaviors
- Can be configured using the standard or extended path
- Settling and fallback timers provide further granularity
  - Dynamic node priority can provide even further granularity in a
    multi-node cluster


Custom Resource Groups

Startup preferences
- Online On Home Node Only (OHN)
- Online On First Available Node (OFAN)
- Online Using Distribution Policy (OUDP)
- Online On All Available Nodes (concurrent) (OAAN)

Fallover preferences
- Fallover To Next Priority Node In The List (FNPN)
- Fallover Using Dynamic Node Priority (FUDNP)
- Bring Offline (On Error Node Only) (BO)
  - Most appropriate for concurrent-type RGs

Fallback preferences
- Fallback To Higher Priority Node (FHPN)
- Never Fallback (NFB)
Resource Distribution Policies

Control placement of IPAT-via-aliasing service labels
- Collocation - all resources of this type will be on the same
  physical resource
- Anti-collocation - all resources of this type are allocated to the
  first physical resource that is not already serving a resource -
  the default

SMIT path: HACMP Extended Resources Configuration
  -> Configure Resource Distribution Policies
  -> Configure Service IP Labels/Address Distribution Preference

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Network Name                                    net_ether_01
* Distribution Preference                         Anti-Collocation  +


Resource Distribution Policies

All policies are exercised by cluster event scripts
- acquire_service_addr
- acquire_takeover_addr
- cl_configure_persistent_address
  - collocation with persistent
  - anti-collocation with persistent

The feature is available in all versions of HACMP V5
- HA 5.1 requires APAR IY63515
- HA 5.2 requires APAR IY63516


Dependent Resource Groups

Used for multi-tiered architectures that require ordered resource
group processing
- Allows the implementer to specify cluster-wide dependencies between
  resource groups
- parent - child - dependency type
- Option to display dependencies: clRGinfo -a

[Diagram: Resource Group A (child) depends on Resource Group B and
Resource Group C (parents).]


Custom Resources

Three-node cluster with one resource group configured for Online On
Home Node Only priority at startup
- sysa is the current owner of the resource group

[Diagram: GROUPA (a_svc 1.1.1.1, dbvg, dbapp) is online on sysa; sysb
and sysc each have service and standby interfaces and access to dbvg.]


Custom Resources after fallover

sysa has crashed!
- Fallover To Next Priority Node In The List
- If sysb were not available, sysc would have acquired GROUPA

[Diagram: GROUPA (a_svc 1.1.1.1, dbvg, dbapp) is now online on sysb,
which hosts a_svc alongside b_svc.]


Custom Resources - reintegration

sysa is repaired and HACMP is restarted on sysa
- Fallback To Higher Priority Node

[Diagram: GROUPA (a_svc 1.1.1.1, dbvg, dbapp) falls back to sysa.]


Custom Resources after fallover

sysa has crashed!
- Fallover Using Dynamic Node Priority
- Destination determined by DNP rules - e.g., lowest CPU usage

[Diagram: GROUPA (a_svc 1.1.1.1, dbvg, dbapp) moves to whichever of
sysb or sysc the DNP rule selects.]


Online on All Nodes

Up to 32 nodes access the data simultaneously
- All systems have the resource group
- All systems read and write to the database

[Diagram: concurrent resource group GROUPA (dbvg) is online on sysa,
sysb, and sysc at the same time.]


Configuration Settling Time

Implementers can configure a cluster settling time to minimize RG
fallback activity when multiple nodes are started at the same time
- Controls how long to wait for a higher-priority node to join the
  cluster before bringing a resource group online
- One settling time per cluster - applies to resource groups with the
  Online On First Available Node startup preference

Found under "Configure Resource Group Run-Time Policies" in SMIT:

Configure Settling Time for Resource Groups

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Settling Time (in Seconds)                      [0]    #


Custom Resource Groups

Configurable fallback timers
- Fallback timers allow the implementer to control when a RG fallback
  will occur - off-peak, weekends, etc.
- Can only be configured in the extended path

Configure Specific Date Fallback Timer Policy

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Name of the Fallback Policy                     [ ]
* YEAR                                            [ ]    #
* MONTH (Jan - Dec)                               [ ]    +
* Day of Month (1 - 31)                           [ ]    +#
* HOUR (0 - 23)                                   [ ]    +#
* MINUTES (0 - 59)                                [ ]    +#
Application Servers - Standard Path

An application server is a label with an associated start and stop
script
- Start/Stop = absolute path to the executable scripts (a minimal
  sketch follows the screen)

Add an Application Server

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Server Name                                     [myapp]
* Start Script                                    [/usr/local/app/start_app]
* Stop Script                                     [/usr/local/app/stop_app]

F1=Help     F2=Refresh     F3=Cancel     F4=List
F5=Reset    F6=Command     F7=Edit      F8=Image
F9=Shell    F10=Exit       Enter=Do
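The start and stop scripts are ordinary executables written by the
implementer and must exist at the same path on every node that can own
the resource group. A minimal sketch (the application binary and its
options are assumptions):

    #!/bin/ksh
    # /usr/local/app/start_app - invoked by HACMP when the RG comes online
    # A zero exit code tells HACMP the start succeeded
    /usr/local/app/bin/appd &    # appd is a hypothetical application daemon
    exit 0

    #!/bin/ksh
    # /usr/local/app/stop_app - invoked by HACMP when the RG goes offline
    # Must stop the application cleanly and quickly
    /usr/local/app/bin/appd_stop
    exit 0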


Application Monitoring

HACMP supports multiple monitors per application server
- Configured via SMIT - Extended Configuration

# smitty hacmp
  Extended Configuration
    Extended Resource Configuration
      HACMP Extended Resources Configuration
        Configure HACMP Applications
          Add an Application Server

Add Application Server
                                                 [Entry Fields]
* Server Name                                     [appsrv]
* Start Script                                    [/app/startserver]
* Stop Script                                     [/app/stopserver]
  Application Monitor Name(s)                     monitor1 monitor2  +


Application Monitoring

Add a Process Application Monitor

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Monitor Name                                    []
* Application Server(s) to Monitor                        +
* Monitor Mode                                    [Long-running monitori>  +
* Processes to Monitor                            []
* Process Owner                                   []
  Instance Count                                  []    #
* Stabilization Interval                          []    #
* Restart Count                                   []    #
  Restart Interval                                []    #
* Action on Application Failure                   [notify]  +
  Notify Method                                   []
  Cleanup Method                                  []
  Restart Method                                  []
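The Notify, Cleanup, and Restart methods are likewise
implementer-supplied scripts; HACMP simply executes them when the
monitor declares a failure. A hedged sketch of a notify method (the
log path and message text are assumptions):

    #!/bin/ksh
    # Hypothetical notify method for "Action on Application Failure = notify"
    MSG="$(date): application monitor reported a failure on $(hostname)"
    print "$MSG" >> /var/hacmp/log/app_notify.log
    echo "$MSG" | mail -s "HACMP application failure" root
    exit 0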


User Interface

SMIT flow in HACMP

The "Standard" configuration path allows users to easily configure the
most common options
- IPAT via Aliasing networks
- Shared service IP labels
- Volume groups and filesystems
- Application servers
- Easy as pie

The "Extended" path is used for fine-tuning a configuration and
configuring less common features
- Configure all network types
- Configure all resource types
- Less common options
  - Site support
  - Application monitoring
  - Performance tuning parameters
User Interface

Topology configuration in the "Standard" path is carried out
automatically
- Configuration discovery is automatic
- Node names are set by discovering the host names
- IP network topology is set based on physical connectivity and
  netmasks

Why do I need the "Extended" path?
- Specify sites, global networks, specific network attributes
- Application monitoring
- Tape resources
- Custom disk methods and resource recovery
- Extended event configuration
- Extended performance tuning
- Security and users
- Snapshot configuration
User Interface

# smitty hacmp

HACMP for AIX

Move cursor to desired item and press Enter.

  Initialization and Standard Configuration
  Extended Configuration
  System Management (C-SPOC)
  Problem Determination Tools

F1=Help     F2=Refresh     F3=Cancel     F8=Image
F9=Shell    F10=Exit       Enter=Do


Standard Path Installation

Standard Configuration menu

Initialization and Standard Configuration

Move cursor to desired item and press Enter.

  Two-Node Cluster Configuration Assistant
  Add Nodes to an HACMP Cluster
  Configure Resources to Make Highly Available
  Configure HACMP Resource Groups
  Verify and Synchronize HACMP Configuration
  HACMP Cluster Test Tool
  Display HACMP Configuration

F1=Help     F2=Refresh     F3=Cancel     F8=Image
F9=Shell    F10=Exit       Enter=Do


Two-Node Configuration Assistant

An advanced automation infrastructure for on-demand operating
environments
- Intended for pre-existing application environments that wish to add
  high availability

Automatically configures a simple two-node cluster based on the
following input:
- Communication path to the remote node
- Application server name
- Application start/stop scripts
- Service IP label

Will automatically copy the start/stop scripts to the remote node


Two Node configuration Assistant

Users must configure the topology and resources at the AIX level
before using the Configuration Assistant.

Before you start, complete the following tasks:
- Connect and configure all IP network interfaces
- Install and configure the application to be made highly available
- Add the application's service IP label to /etc/hosts on all nodes
- Configure the volume groups that contain the application's shared
  data on disks that are attached to both nodes

Have the following ready (a quick check is sketched below):
- An active communication path to the takeover node
- A unique name to identify the application to be made highly available
- The full path to the application's start and stop scripts
- The application's service IP label
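The AIX-level preparation can be sanity-checked with standard commands
before launching the assistant (app_svc and appvg are example names):

    # Service IP label resolvable on both nodes
    grep app_svc /etc/hosts

    # All IP interfaces configured and up
    netstat -in

    # Shared volume group disks visible on this node
    lspv | grep appvg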


Two-Node Cluster Configuration Assistant

# smitty hacmp
  Initialization and Standard Configuration
    Two-Node Cluster Configuration Assistant

Two-Node Cluster Configuration Assistant

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Communication Path to Takeover Node             []    +
* Application Server Name                         []
* Application Server Start Script                 []
* Application Server Stop Script                  []
* Service IP Label                                []    +


Two-Node Configuration Assistant

Will create a cluster with the following characteristics:
- IPAT via IP aliasing
- Local node is the highest priority
- Startup on highest priority node
- Fallover to the remote node
- Never Fallback
- Contains one application server
- Contains one service IP label
- Contains all shareable volume groups

Activity is logged to /var/hacmp/log/clconfigassist

Will synchronize and verify (clverify)
- Can be set to auto-correct - default is no


Cluster Start

# smitty clstart
  System Management (C-SPOC)
    Manage HACMP Services
      Start Cluster Services

Start Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Start now, on system restart or both            [both]
  Start Cluster Services on these nodes           []
  BROADCAST message at startup?                   true            +
  Startup Cluster Information Daemon?             true            +
  Reacquire after forced down                     false           +
  Ignore verification errors?                     false           +
  Automatically correct errors found during       Interactively   +
  cluster start?
clverify logfiles

clverify collects and archives the data:

- /var/hacmp/clverify/current/ - stores data used during the current
  verification attempt; should not exist unless verification is
  running or was aborted
- /var/hacmp/clverify/aborted/ - stores data from the most recent
  aborted verification attempt
- /var/hacmp/clverify/fail/ - stores data from the most recent failed
  verification attempt
- /var/hacmp/clverify/pass/ - stores data from the most recent passed
  verification attempt
- /var/hacmp/clverify/pass.prev/ - stores data from the second most
  recent passed attempt
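A quick look at the most recent verification results, using the
directories above (the clverify.log location is an assumption):

    # Newest pass/fail data
    ls -lt /var/hacmp/clverify/pass /var/hacmp/clverify/fail 2>/dev/null | head

    # Tail of the verification log
    tail -50 /var/hacmp/clverify/clverify.log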
Standard Path Installation

Standard topology configuration
- Users must specify a communication path: IP address, IP label, or FQDN
- HACMP will contact the nodes using the specified communication paths
  and automatically configure the base IP topology

Configure Nodes to an HACMP Cluster (standard)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Cluster Name                                    [andrews_cluster]
  New Nodes (via selected communication paths)    [node1 node2]  +
  Currently Configured Node(s)


Installation Standard Path

Standard resource configuration
- Users may only configure the most common resource types
- NOTE: Service IP labels/addresses are now configured as resources
- Configuring a service label is required for the "Standard" path

Configure Resources to Make Highly Available

Move cursor to desired item and press Enter.

  Configure Service IP Labels/Addresses
  Configure Application Servers
  Configure Volume Groups, Logical Volumes, and Filesystems
  Configure Concurrent Volume Groups and Logical Volumes


Cluster Test Tool

Simplifies cluster validation
- Automates testing of an HACMP cluster
- Tests are carried out in sequence and analyzed by the tool
- Log file: /var/hacmp/log/cl_testtool.log
- Custom test plans can be created (a sketch follows)

The Cluster Test Tool runs the following tests by default:
- NODE_UP - start one or more nodes
- NODE_DOWN_FORCED - stop a node forced
- NODE_DOWN_GRACEFUL - stop one or more nodes
- NODE_DOWN_TAKEOVER - stop a node with takeover
- CLSTRMGR_KILL - catastrophic failure
- NETWORK_DOWN_LOCAL - stop a network on a node
- NETWORK_UP_LOCAL - restart a network on a node
- SERVER_DOWN - stop an application server
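A custom test plan is a flat file of test entries processed in order.
The sketch below uses the test names from the list above, but the exact
parameter syntax is an assumption - consult the Administration Guide
before use:

    # Hypothetical custom test plan, e.g. /tmp/my_testplan
    NODE_UP,node1,Bring node1 into the cluster
    NODE_UP,node2,Bring node2 into the cluster
    NETWORK_DOWN_LOCAL,node1,net_ether_01,Fail the local network on node1
    SERVER_DOWN,node1,appsrv,Stop the application server
    NODE_DOWN_TAKEOVER,node1,Stop node1 with takeover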
SMS Text Messaging - HACMP

Allows alerts of cluster events to be sent to cell phones and pagers
- Easily customizable using SMIT
- Messages may be sent through an SMS gateway
  - 1-555-444-9999@sms.verizon.com
  - andrew_cell@cingular.com
- GSM modem (Global System for Mobile Communications)
  - A wireless modem that connects to a cellular network, allowing a
    computer to send messages wirelessly


SMS Text Messaging - HACMP

A ";" in the number will result in an alpha numeric page being


sent - 18005552222;437-1881
The @ character will send an SMS message using /usr/bin/mail
The "#" will cause an SMS message to be sent wirelessly via
GSM modem - 437-1881#

Add a Custom Remote Notification Method

Type or select Values in entry fields.


Press Enter AFTER making all desired changes
[Entry Fields]

* Method Name [SMS_Notify]


Description [ Node Down ]
* Nodename(s) [ NodeA] +
* Number to dial or cell phone address [ 6034442222@sms.verizon.net]
* Filename [ /usr/es/sbin/cluster/samples/pager/sample.txt ]
•Cluster Event (s) node_down +
© IBM Corporation 2008
© IBM Corporation 2008
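Because the "@" form delivers through /usr/bin/mail, the gateway
address can be smoke-tested outside HACMP (the address is the example
from the screen above):

    # Manually send a test message through the SMS e-mail gateway
    echo "test: node_down on NodeA" | mail -s "HACMP alert" 6034442222@sms.verizon.net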
Smart Assists

WebSphere 6.0 standalone and ND
- N+1 and hot standby

Oracle "cold failover cluster" (CFC)
- Oracle Application Server 10g (9.0.4) (AS 10g)
- Two-node - hot standby

DB2 UDB Enterprise Server Edition (v8.1 and v8.2)
- N+1 and hot standby
- DB2 software must not be installed on the shared storage


HACMP/XD

HACMP/XD for PPRC
- Uses site support for PPRC
- Provides support for ESS 2105-F20 and 2105-800

HACMP/XD for eRCMF (Enterprise Remote Copy Management Facility)
- Requires ESS eRCMF version 2.0

HACMP/XD for GLVM
- Uses site support; supports cross-site data replication with no
  distance limitation
- Synchronous
- A maximum of 2 sites


HACMP XD:PPRC Support

Peer-to-Peer Remote Copy
- HACMP coordinates the ESS units ("Sharks") to ensure failover of the
  environment

[Diagram: Site 1 (Shark 1) and Site 2 (Shark 2) are connected by a
WAN; ESS PPRC provides hardware-based data mirroring between them.]


Two Site HACMP Cluster With Geographic LVM

[Diagram: Node A in Boston and Node B in Austin are connected by a
TCP/IP WAN. In Boston, PV1 = hdisk7, PV2 = hdisk8, PV3 = hdisk9,
PV4 = hdisk10; in Austin, PV1 = hdisk5, PV2 = hdisk6, PV3 = hdisk7,
PV4 = hdisk8. Boston holds real copy #1 and a virtual copy of #2;
Austin holds real copy #2 and a virtual copy of #1.]

One volume group actually spans both sites, and each site contains a
copy of the mission-critical data. Instead of extremely long disk
cables, a TCP/IP network and the RPV device driver are used for remote
disk access.


More Information

Education:
- HACMP System Administration I: Planning and Implementation
- HACMP System Administration II: Administration and Problem
  Determination
- HACMP System Administration III: Virtualization and Disaster Recovery
- HACMP Problem Determination and Recovery
- HACMP Certification Workshop (2 days)
- AM050 HACMP High Availability Products for pSeries Overview (2 days)

hafeedbk@us.ibm.com - comments and questions about HACMP

- www-1.ibm.com/servers/eserver/pseries/ha
- http://www.ibm.com/servers/aix/library
- http://www-1.ibm.com/servers/eserver/pseries/library/hacmp_docs.html
