
WebSphere Application Server V6

High Availability Manager Overview

David Currie, IT Specialist


EMEA Software Lab Services

© 2006 IBM Corporation



Agenda

 Introduction
 Core groups
 Core group bridge service
 High availability groups and policies
 Transaction service
 WebSphere XD partitioning facility
 Service Integration Bus
 Questions

2 High Availability Manager Overview © 2003 IBM Corporation



What is the high availability manager?

 Component of WebSphere Application Server V6 providing
high availability services to other WAS components
 Runs in every WAS process
 Provides three key capabilities:
– High availability of singleton services
– Bulletin board for exchanging state data between processes
– Data Replication Service (DRS) memory-to-memory replication


Core groups

 Set of processes representing a high availability domain in
which failover and replication take place
 Each process must be a member of exactly one core group
 All cluster members must belong to the same group
 Each group must contain at least one node agent or
deployment manager
 Processes are added to DefaultCoreGroup when created
 Members of a group must have full IP visibility and
bidirectional communication to all other members


Core group coordinator

 Every core group has a coordinator that manages the
failover of highly available singleton services and distributes
state data to interested members
 Coordinator uses CPU and heap to perform these tasks
 Coordinator election occurs whenever the view changes (a
member stops or starts) and this consumes resources
 Specify stable servers with spare capacity as preferred
coordinator servers
 Multiple coordinators may be necessary if resource usage is
excessive


Core group transport options

 Channel framework transport
– Default is to use the distribution and consistency services (DCS)
transport chain
– DCS_SECURE transport chain uses SSL for encryption
– LTPA token can be used to authenticate incoming requests
 Unicast transport
– Standard network connection avoiding channel framework
overhead
– LTPA token for authentication but no SSL
 Multicast transport
– Uses UDP but still requires TCP/IP for failure detection
protocol


Core group discovery protocol

 Establishes network connectivity with other
members of the core group on startup
 Retries unavailable connections periodically
– Default is every 30s (WAS 6.0.2) or 15s (WAS 6.0/6.0.1)
– Configurable via core group custom property
• IBM_CS_UNICAST_DISCOVERY_INTERVAL_SECS
 On successful connection DCSV1032I event logged
and view synchrony protocol starts


Core group failure detection protocol

 Monitors the connections that the discovery protocol
establishes
 Failure of core group member detected by:
– Inbound or outbound sockets closing
– Active heartbeating, configured via core group custom
properties
• IBM_CS_FD_PERIOD_SECS (default 30s)
• IBM_CS_FD_CONSECUTIVE_MISSED (default 6)
 Failed members are reported to the discovery protocol and
view synchrony protocol
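The two heartbeating properties above combine into a worst-case detection time for a member that hangs without closing its sockets. A minimal arithmetic sketch, assuming the defaults shown on this slide:

```python
# Worst-case detection time for a hung member via active heartbeating.
# Property names and defaults are taken from the slide; the calculation
# itself is plain arithmetic, not a WebSphere API.

def detection_time(period_secs: int = 30, consecutive_missed: int = 6) -> int:
    """Seconds before a member that stops responding is declared failed."""
    return period_secs * consecutive_missed

print(detection_time())        # defaults: 30s period * 6 missed beats = 180
print(detection_time(10, 3))   # a hypothetical, more aggressive tuning = 30
```

Lowering either value detects failures faster at the cost of more heartbeat traffic and a higher risk of false positives on a loaded network.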


Core group view synchrony protocol

 A view consists of a set of connected core group members
communicating over this protocol
 View change occurs when a member is discovered or fails
– Activities relating to the old view must complete and as a result
temporary spikes in CPU and network usage may occur
– One of the core group members is elected to send its current
configuration to all the other members
– Inconsistencies in HA policy or coordinator configuration are
tolerated but those in core group membership are not


Core group scaling


 System resource usage does not scale linearly with core group size
 Resource requirements of view synchrony protocol dependent on:
– Number of applications running
– Type of applications running
– High availability manager services that are used
 View changes use a lot of system resource:
– Each member communicates its state to other members
– All messages sent or received must be acknowledged
– May cause changes in routing table or singleton services
 In core groups that are too large, degenerate network timing conditions can cause installation of
the new view to fail; recovery is CPU intensive and may lead to paging, causing further failures
 M x (N - M) discovery messages each period (M = running members, N = group size)
 M x (M - 1) heartbeating messages
 Amount of CPU utilization by other components e.g. WLM and ODR is also linked to core group size
 Recommendation is for maximum 50 members per core group
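The two message-count formulas above can be sketched directly; the numbers below simply plug illustrative values into the slide's formulas:

```python
# Per-period protocol message counts from the formulas on this slide:
#   discovery:  M * (N - M)  (running members probe stopped ones)
#   heartbeat:  M * (M - 1)  (each running member beats every other)

def discovery_messages(running: int, size: int) -> int:
    return running * (size - running)

def heartbeat_messages(running: int) -> int:
    return running * (running - 1)

# At the recommended ceiling of 50 members, all running:
print(heartbeat_messages(50))        # 2450 heartbeats per period
# With 40 of 50 members running, the rest being probed:
print(discovery_messages(40, 50))    # 400 discovery probes per period
```

The quadratic growth of the heartbeat term is one reason resource usage climbs so steeply with core group size.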


Core group configuration

 Core group configuration is stored at cell scope
– List of core group members
– HA policies for the core group
– Core group coordinator configuration
– Core group transport configuration
 Each process also has its own configuration
– Whether HA manager is enabled
– Transport buffer size
– Name of the core group to which the server belongs
– How frequently the HA manager checks the health of singletons


Why have multiple core groups in a cell?

 One or more firewalls within a cell – a core group
cannot contain members from multiple firewall
protection domains
 Resource usage increases exponentially with core
group size


Core group bridge service

 Core groups may need to communicate to share availability
information
 Access point is a collection of core groups that
communicate
 Each access point has one or more bridge interfaces defined
as a node, server and transport chain
 A server hosting a bridge interface is a core group bridge
server
 If communicating with a core group in another cell a peer
access point is defined which may specify one or more peer
ports or, if not directly accessible, a peer proxy port

Communication between core groups in a cell

[Diagram only]

Communication between core groups across cells

[Diagram only]

Communication between core groups across networks

[Diagram only]

Core group bridge service custom properties

 Custom properties
– CGB_ENABLE_602_FEATURES enables core group
bridge servers to be added without restarting other
servers and allows discovery of peer ports
– FW_PASSIVE_MEMBER can be used if a firewall is
configured to listen only
– IBM_CS_LS_DATASTACK_MEG can be used to
increase the data stack size if you are seeing warning
DCSV2005W


High availability groups

 Created dynamically when an application server component
requests to join a group
 Other instances of the component across multiple
processes may join the group
 HA group has a name made up of name-value pairs
– Company=IBM,ComponentName=TM,policy=DefaultNoQuorumOneOfNPolicy
 Each group member is either idle, active, or disabled
 Scoped by a core group


High availability group policies

 Statically defined HA policies govern which members of an HA
group are active
 Policy selected for HA group based on having the most
name-value pairs matching the group name
 At least one policy must match and only one policy may have
the highest number of matches
 Policy rules applied when
– Member joins or leaves HA group
– State of a member changes e.g. from idle to disabled
 Policy changes are picked up dynamically
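The matching rule above — every criterion of a policy must appear in the group name, the policy with the most matching pairs wins, and a tie is a configuration error — can be sketched as follows. The group and policy names here are illustrative, not taken from a real configuration:

```python
# Sketch of HA policy selection as described on this slide.
# Group names are comma-separated name=value pairs.

def parse_group_name(name: str) -> dict:
    return dict(pair.split("=", 1) for pair in name.split(","))

def select_policy(group_name: str, policies: dict) -> str:
    group = parse_group_name(group_name)
    # A policy matches when all of its criteria appear in the group name.
    scores = {
        policy: len(criteria)
        for policy, criteria in policies.items()
        if all(group.get(k) == v for k, v in criteria.items())
    }
    if not scores:
        raise ValueError("no policy matches the group")
    best = max(scores.values())
    winners = [p for p, s in scores.items() if s == best]
    if len(winners) > 1:
        raise ValueError("ambiguous match: " + ", ".join(winners))
    return winners[0]

group = "IBM_hc=MyCluster,type=WSAF_SIB,WSAF_SIB_MESSAGING_ENGINE=ME0"
policies = {
    "Default SIB Policy": {"type": "WSAF_SIB"},
    "ME0 Policy": {"type": "WSAF_SIB", "WSAF_SIB_MESSAGING_ENGINE": "ME0"},
}
print(select_policy(group, policies))  # → ME0 Policy (two matches beat one)
```

This is why the later SIB slides insist that per-engine policies match on both type and engine name: the extra criterion is what breaks the tie with the default policy.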


High availability policy settings

 Policy type
– All active
– M of N active
– No operation
– One of N
– Static
 Preferred servers (ordered list)
 Fail back – always active on the most preferred server available
 Preferred servers only
 Is alive timer used to determine failure of process or component
 Quorum setting only activates members once a majority are available
 Dynamic policy information may also be contained in the group name
e.g. GN_PS contains the names of the preferred servers

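The interaction of the ordered preferred-server list, "Fail back", and "Preferred servers only" amounts to a simple election, sketched below with made-up server names:

```python
# Sketch of One of N activation with an ordered preferred-server list.
# With "Fail back" enabled, this election simply reruns whenever a more
# preferred server comes online, moving the active member back to it.

def choose_active(online: list, preferred: list, preferred_only: bool = False):
    """Return the member to activate, or None if nobody qualifies."""
    for server in preferred:            # ordered: most preferred first
        if server in online:
            return server
    if preferred_only:
        return None                     # "Preferred servers only" is set
    return online[0] if online else None

preferred = ["server1", "server2"]
print(choose_active(["server2", "server3"], preferred))   # → server2
print(choose_active(["server3"], preferred))              # → server3
print(choose_active(["server3"], preferred, True))        # → None
```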

Policy modification

 Do not delete the IBM-provided policies
 Do not try to change the type of an existing policy
 Ensure that the components that a policy will
match support that policy type
 Ensure that match criteria are not ambiguous by
adding criteria or deleting unwanted policies


Viewing HA group information

 The current HA groups can be viewed on the Runtime tab of a core
group
 Show servers then displays the HA groups for each server
 Show groups displays the HA groups for the entire core group
 Viewing group shows members and their current status


Disabling HA manager

 CPU, heap, and socket usage increases
exponentially with core group size
 HA manager can be disabled on a per-process
basis if its functionality is not required
 Do not disable the HA manager on the deployment
manager or node agent unless it is disabled on all
servers in that core group


Transaction service high availability


[Diagram: a cluster of server1, server2, and server3, each with its own transaction log]

Transaction service default policy and HA groups

 Default Clustered TM Policy is a One of N policy with match
criteria of type=WAS_TRANSACTIONS and failback enabled
 By default each server joins just one group containing its
name as a dynamic preferred server


Enabling transaction service high availability

 Select the “High availability for
persistent services” option on
the cluster definition
 Configure recovery log
location for transaction service
on each server with path
accessible from every server
 Each server will then also join
the group for every other
server in the cluster


File locking

 By default file locking is used
to prevent recovery in the event of
system overload or network
partitioning
 NFSv3 does not release the file
locks held by a failed host
 If these conditions can be
avoided then disable file
locking


Manual peer recovery

 If using NFSv3 and system
overload or network
partitioning may occur,
consider manual peer recovery
 Define a static HA policy for
each server in the cluster with
match criteria of
type=WAS_TRANSACTIONS
and
GN_PS=cellname\nodename\servername
 Add the corresponding server
to the static group
 On failure, add the peer server
to the static group
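The match criteria above can be sketched as a small helper for composing the GN_PS value; the cell, node, and server names below are placeholders, and applying the criteria is still done through the admin console or wsadmin, not shown here:

```python
# Sketch of composing the match criteria for the static per-server
# transaction policy described on this slide. The GN_PS value joins
# cell, node, and server names with backslashes.

def static_tm_criteria(cell: str, node: str, server: str) -> dict:
    return {
        "type": "WAS_TRANSACTIONS",
        "GN_PS": "\\".join([cell, node, server]),
    }

criteria = static_tm_criteria("myCell", "myNode", "server1")
print(criteria["GN_PS"])   # → myCell\myNode\server1
```

On failure of server1, adding a peer server to this static group is what triggers recovery of server1's logs on that peer.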


WebSphere XD partitioning facility

 Enables incoming requests to be distributed across
application servers dependent on the content of the request
 Data accessed by requests can then be distributed across
those servers and held in memory

[Diagram: an ODR or EJB stub routing requests into a cell, with partition1 and partition2 hosted on server1 and partition3 and partition4 on server2]

WPF HA groups and policies

[Diagram only]

Service Integration Bus high availability

 Adding a cluster as a member of a service integration bus
results in the creation of a single messaging engine
 The messaging engine will start in the first available server
and, if that becomes unavailable, fail over to another

[Diagrams: messaging engine ME0 with queue Q1 running on server1; after server1 fails, ME0 and Q1 restart on server2]

SIB default policy and HA groups

 Default SIB Policy is a One of N policy with match criteria of
type=WSAF_SIB
 Each server joins a group containing the name of the messaging
engine
 HA manager activates the messaging engine on the first available
server


Service Integration Bus workload management

 Additional messaging engines can be added to provide
workload management capabilities
 Destinations associated with the cluster bus member are
partitioned across the messaging engines and messages
distributed across those partitions
 HA policies must be configured, or else all the messaging
engines will start on the first available server!
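The partitioning described above can be sketched as a simple round-robin assignment of messages to engines. The real distribution algorithm is not specified on this slide, so round-robin is an assumption made purely for illustration:

```python
# Sketch (assumed round-robin) of spreading messages for a partitioned
# destination across the messaging engines of a cluster bus member.
from itertools import cycle

def assign(messages, engines):
    rr = cycle(engines)
    return [(m, next(rr)) for m in messages]

print(assign(["m1", "m2", "m3", "m4"], ["ME0", "ME1"]))
# → [('m1', 'ME0'), ('m2', 'ME1'), ('m3', 'ME0'), ('m4', 'ME1')]
```

Whatever the actual algorithm, the key consequence is the same: a consumer attached to one partition sees only the messages routed to that partition, so message ordering across the destination as a whole is no longer guaranteed.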

[Diagram: a cluster with ME0 on server1 and ME1 on server2, destination Q1 partitioned across both]

SIB HA policies for workload management

 If failover isn’t required, create a static policy for each messaging
engine tying it to a particular server
 If failover is required, create a One of N policy for each messaging
engine specifying a preferred server and failback
 Policy must match messaging engine name and type to avoid
ambiguity with default policy


SIB HA policies for large clusters

 Specify an ordered list of
preferred servers for each
messaging engine, rotating
across a subset of the
cluster members
 Select Preferred servers
only
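The rotation suggested above can be sketched as follows, with made-up server and engine names: each engine gets an ordered preferred list drawn from a small window of cluster members, shifted one position per engine.

```python
# Sketch of rotating preferred-server lists across a subset of a large
# cluster, as suggested on this slide. Names are illustrative.

def rotated_preferences(servers, engines, window=3):
    prefs = {}
    for i, engine in enumerate(engines):
        # Each engine prefers a window of servers starting at offset i.
        prefs[engine] = [servers[(i + j) % len(servers)] for j in range(window)]
    return prefs

servers = ["s1", "s2", "s3", "s4"]
print(rotated_preferences(servers, ["ME0", "ME1"], window=2))
# → {'ME0': ['s1', 's2'], 'ME1': ['s2', 's3']}
```

Combined with "Preferred servers only", this keeps each engine's failover candidates to a handful of servers rather than the whole cluster, limiting view-change churn.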


Summary

 Core groups are static collections of communicating servers
 Bridge servers can be used to share information between core
groups
 HA groups are created dynamically by WebSphere components
 HA policies define where those components should be active
 Examples of where you may come into contact with the HA manager
include:
– Transaction Service
– WebSphere XD partitioning facility
– Service Integration Bus


References
 Transactional High Availability and Deployment Considerations in WAS V6
– http://www.ibm.com/developerworks/websphere/techjournal/0504_beaven/0504_beaven.html
 WebSphere Application Server V6 System Management and Configuration Handbook
– http://www.redbooks.ibm.com/abstracts/sg246451.html
 WebSphere Application Server Network Deployment V6: High availability solutions
– http://www.redbooks.ibm.com/abstracts/sg246688.html
 WebSphere Application Server V6 Scalability and Performance Handbook
– http://www.redbooks.ibm.com/abstracts/sg246392.html
 WAS V6 InfoCenter: High availability and workload sharing
– http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/topic/com.ibm.websphere.pmc.nd.doc/tasks/tjt9999_.html
 WAS V6 InfoCenter: Setting up a high availability environment
– http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/topic/com.ibm.websphere.nd.doc/info/ae/ae/trun_ha_environment.html
 WebSphere XD InfoCenter: HA manager and the partitioning facility
– http://publib.boulder.ibm.com/infocenter/wxdinfo/v6r0/topic/com.ibm.websphere.xd.doc/info/WPF51/cwpfha_pdf.html
