
Cluster Administration

HA Resource Behavior

This appendix describes common behavior of HA resources. It is meant to provide ancillary information that may be helpful in configuring HA services. You can configure the parameters with luci or by editing /etc/cluster/cluster.conf. For descriptions of HA resource parameters, refer to Appendix B, HA Resource Parameters. To understand resource agents in more detail you can view them in /usr/share/cluster of any cluster node.

Note

To fully comprehend the information in this appendix, you may require detailed understanding of resource agents and the cluster configuration file, /etc/cluster/cluster.conf.
An HA service is a group of cluster resources configured into a coherent entity that provides specialized services to clients. An HA service is represented as a resource tree in the cluster configuration file, /etc/cluster/cluster.conf (in each cluster node). In the cluster configuration file, each resource tree is an XML representation that specifies each resource, its attributes, and its relationship among other resources in the resource tree (parent, child, and sibling relationships).

Note

Because an HA service consists of resources organized into a hierarchical tree, a service is sometimes referred to as a resource tree or resource group. Both phrases are synonymous with HA service.
At the root of each resource tree is a special type of resource - a service resource. Other types of resources comprise the rest of a service, determining its characteristics. Configuring an HA service consists of creating a service resource, creating subordinate cluster resources, and organizing them into a coherent entity that conforms to hierarchical restrictions of the service.
This appendix consists of the following sections:
  • Section C.1, "Parent, Child, and Sibling Relationships Among Resources"
  • Section C.2, "Sibling Start Ordering and Resource Child Ordering"
  • Section C.3, "Inheritance, the <resources> Block, and Reusing Resources"
  • Section C.4, "Failure Recovery and Independent Subtrees"
  • Section C.5, "Debugging and Testing Services and Resource Ordering"

Note

The sections that follow present examples from the cluster configuration file, /etc/cluster/cluster.conf, for illustration purposes only.

C.1. Parent, Child, and Sibling Relationships Among Resources

A cluster service is an integrated entity that runs under the control of rgmanager. All resources in a service run on the same node. From the perspective of rgmanager, a cluster service is one entity that can be started, stopped, or relocated. Within a cluster service, however, the hierarchy of the resources determines the order in which each resource is started and stopped. The hierarchical levels consist of parent, child, and sibling.
Example C.1, "Resource Hierarchy of Service foo" shows a sample resource tree of the service foo. In the example, the relationships among the resources are as follows:
  • fs:myfs (<fs name="myfs" ...>) and ip:10.1.1.2 (<ip address="10.1.1.2" .../>) are siblings.
  • fs:myfs (<fs name="myfs" ...>) is the parent of script:script_child (<script name="script_child"/>).
  • script:script_child (<script name="script_child"/>) is the child of fs:myfs (<fs name="myfs" ...>).

Example C.1. Resource Hierarchy of Service foo

<service name="foo" ...> <fs name="myfs" ...> <script name="script_child"/> </fs> <ip address="10.1.1.2" .../></service>

The following rules apply to parent/child relationships in a resource tree:
  • Parents are started before children.
  • Children must all stop cleanly before a parent may be stopped.
  • For a resource to be considered in good health, all its children must be in good health.

C.2. Sibling Start Ordering and Resource Child Ordering

The Service resource determines the start order and the stop order of a child resource according to whether it designates a child-type attribute for a child resource as follows:
  • Designates child-type attribute (typed child resource) - If the Service resource designates a child-type attribute for a child resource, the child resource is typed. The child-type attribute explicitly determines the start and the stop order of the child resource.
  • Does not designate child-type attribute (non-typed child resource) - If the Service resource does not designate a child-type attribute for a child resource, the child resource is non-typed. The Service resource does not explicitly control the starting order and stopping order of a non-typed child resource. However, a non-typed child resource is started and stopped according to its order in /etc/cluster/cluster.conf. In addition, non-typed child resources are started after all typed child resources have started and are stopped before any typed child resources have stopped.

Note

The only resource to implement defined child resource type ordering is the Service resource.
For more information about typed child resource start and stop ordering, refer to Section C.2.1, "Typed Child Resource Start and Stop Ordering". For more information about non-typed child resource start and stop ordering, refer to Section C.2.2, "Non-typed Child Resource Start and Stop Ordering".

C.2.1. Typed Child Resource Start and Stop Ordering

For a typed child resource, the type attribute for the child resource defines the start order and the stop order of each resource type with a number that can range from 1 to 100; one value for start, and one value for stop. The lower the number, the earlier a resource type starts or stops. For example, Table C.1, "Child Resource Type Start and Stop Order" shows the start and stop values for each resource type; Example C.2, "Resource Start and Stop Values: Excerpt from Service Resource Agent, service.sh" shows the start and stop values as they appear in the Service resource agent, service.sh. For the Service resource, all LVM children are started first, followed by all File System children, followed by all Script children, and so forth.

Table C.1. Child Resource Type Start and Stop Order

Resource            Child Type    Start-order Value    Stop-order Value
LVM                 lvm           1                    9
File System         fs            2                    8
GFS2 File System    clusterfs     3                    7
NFS Mount           netfs         4                    6
NFS Export          nfsexport     5                    5
NFS Client          nfsclient     6                    4
IP Address          ip            7                    2
Samba               smb           8                    3
Script              script        9                    1

Example C.2. Resource Start and Stop Values: Excerpt from Service Resource Agent, service.sh

<special tag="rgmanager">
    <attributes root="1" maxinstances="1"/>
    <child type="lvm" start="1" stop="9"/>
    <child type="fs" start="2" stop="8"/>
    <child type="clusterfs" start="3" stop="7"/>
    <child type="netfs" start="4" stop="6"/>
    <child type="nfsexport" start="5" stop="5"/>
    <child type="nfsclient" start="6" stop="4"/>
    <child type="ip" start="7" stop="2"/>
    <child type="smb" start="8" stop="3"/>
    <child type="script" start="9" stop="1"/>
</special>

Ordering within a resource type is preserved as it exists in the cluster configuration file, /etc/cluster/cluster.conf. For example, consider the starting order and stopping order of the typed child resources in Example C.3, "Ordering Within a Resource Type".

Example C.3. Ordering Within a Resource Type

<service name="foo">  <script name="1" .../>  <lvm name="1" .../>  <ip address="10.1.1.1" .../>  <fs name="1" .../>  <lvm name="2" .../></service>

Typed Child Resource Starting Order

In Example C.3, "Ordering Within a Resource Type", the resources are started in the following order:
  1. lvm:1 - This is an LVM resource. All LVM resources are started first. lvm:1 (<lvm name="1" .../>) is the first LVM resource started among LVM resources because it is the first LVM resource listed in the Service foo portion of /etc/cluster/cluster.conf.
  2. lvm:2 - This is an LVM resource. All LVM resources are started first. lvm:2 (<lvm name="2" .../>) is started after lvm:1 because it is listed after lvm:1 in the Service foo portion of /etc/cluster/cluster.conf.
  3. fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
  4. ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
  5. script:1 - This is a Script resource. If there were other Script resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.

Typed Child Resource Stopping Order

In Example C.3, "Ordering Within a Resource Type", the resources are stopped in the following order:
  1. script:1 - This is a Script resource. If there were other Script resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  2. ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  3. fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  4. lvm:2 - This is an LVM resource. All LVM resources are stopped last. lvm:2 (<lvm name="2" .../>) is stopped before lvm:1; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  5. lvm:1 - This is an LVM resource. All LVM resources are stopped last. lvm:1 (<lvm name="1" .../>) is stopped after lvm:2; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
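
If you prefer to verify the computed ordering rather than work it out by hand, the rg_test utility described in Section C.5, "Debugging and Testing Services and Resource Ordering" can print it for you. A minimal sketch, assuming the configuration above is stored in /etc/cluster/cluster.conf and the service is named foo:

rg_test noop /etc/cluster/cluster.conf start service foo
rg_test noop /etc/cluster/cluster.conf stop service foo

The first command displays the start order and the second displays the stop order, without actually starting or stopping any resources.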

C.2.2. Non-typed Child Resource Start and Stop Ordering

Additional considerations are required for non-typed child resources. For a non-typed child resource, starting order and stopping order are not explicitly specified by the Service resource. Instead, starting order and stopping order are determined according to the order of the child resource in /etc/cluster/cluster.conf. Additionally, non-typed child resources are started after all typed child resources and stopped before any typed child resources.
For example, consider the starting order and stopping order of the non-typed child resources in Example C.4, "Non-typed and Typed Child Resource in a Service".

Example C.4. Non-typed and Typed Child Resource in a Service

<service name="foo">  <script name="1" .../>  <nontypedresource name="foo"/>  <lvm name="1" .../>  <nontypedresourcetwo name="bar"/>  <ip address="10.1.1.1" .../>  <fs name="1" .../>  <lvm name="2" .../></service>

Non-typed Child Resource Starting Order

In Example C.4, "Non-typed and Typed Child Resource in a Service", the child resources are started in the following order:
  1. lvm:1 - This is an LVM resource. All LVM resources are started first. lvm:1 (<lvm name="1" .../>) is the first LVM resource started among LVM resources because it is the first LVM resource listed in the Service foo portion of /etc/cluster/cluster.conf.
  2. lvm:2 - This is an LVM resource. All LVM resources are started first. lvm:2 (<lvm name="2" .../>) is started after lvm:1 because it is listed after lvm:1 in the Service foo portion of /etc/cluster/cluster.conf.
  3. fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
  4. ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
  5. script:1 - This is a Script resource. If there were other Script resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
  6. nontypedresource:foo - This is a non-typed resource. Because it is a non-typed resource, it is started after the typed resources start. In addition, its order in the Service resource is before the other non-typed resource, nontypedresourcetwo:bar; therefore, it is started before nontypedresourcetwo:bar. (Non-typed resources are started in the order that they appear in the Service resource.)
  7. nontypedresourcetwo:bar - This is a non-typed resource. Because it is a non-typed resource, it is started after the typed resources start. In addition, its order in the Service resource is after the other non-typed resource, nontypedresource:foo; therefore, it is started after nontypedresource:foo. (Non-typed resources are started in the order that they appear in the Service resource.)

Non-typed Child Resource Stopping Order

In Example C.4, "Non-typed and Typed Child Resource in a Service", the child resources are stopped in the following order:
  1. nontypedresourcetwo:bar - This is a non-typed resource. Because it is a non-typed resource, it is stopped before the typed resources are stopped. In addition, its order in the Service resource is after the other non-typed resource, nontypedresource:foo; therefore, it is stopped before nontypedresource:foo. (Non-typed resources are stopped in the reverse order that they appear in the Service resource.)
  2. nontypedresource:foo - This is a non-typed resource. Because it is a non-typed resource, it is stopped before the typed resources are stopped. In addition, its order in the Service resource is before the other non-typed resource, nontypedresourcetwo:bar; therefore, it is stopped after nontypedresourcetwo:bar. (Non-typed resources are stopped in the reverse order that they appear in the Service resource.)
  3. script:1 - This is a Script resource. If there were other Script resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  4. ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  5. fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  6. lvm:2 - This is an LVM resource. All LVM resources are stopped last. lvm:2 (<lvm name="2" .../>) is stopped before lvm:1; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
  7. lvm:1 - This is an LVM resource. All LVM resources are stopped last. lvm:1 (<lvm name="1" .../>) is stopped after lvm:2; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.

C.3. Inheritance, the <resources> Block, and Reusing Resources

Some resources benefit by inheriting values from a parent resource; that is commonly the case in an NFS service. Example C.5, "NFS Service Set Up for Resource Reuse and Inheritance" shows a typical NFS service configuration, set up for resource reuse and inheritance.

Example C.5. NFS Service Set Up for Resource Reuse and Inheritance

<resources>
    <nfsclient name="bob" target="bob.example.com" options="rw,no_root_squash"/>
    <nfsclient name="jim" target="jim.example.com" options="rw,no_root_squash"/>
    <nfsexport name="exports"/>
</resources>
<service name="foo">
    <fs name="1" mountpoint="/mnt/foo" device="/dev/sdb1" fsid="12344">
        <nfsexport ref="exports">
            <!-- nfsexport's path and fsid attributes are inherited from the
                 mountpoint and fsid attributes of the parent fs resource -->
            <nfsclient ref="bob"/>
            <!-- nfsclient's path is inherited from the mountpoint and the
                 fsid is added to the options string during export -->
            <nfsclient ref="jim"/>
        </nfsexport>
    </fs>
    <fs name="2" mountpoint="/mnt/bar" device="/dev/sdb2" fsid="12345">
        <nfsexport ref="exports">
            <nfsclient ref="bob"/>
            <!-- Because all of the critical data for this resource is either
                 defined in the resources block or inherited, we can reference
                 it again! -->
            <nfsclient ref="jim"/>
        </nfsexport>
    </fs>
    <ip address="10.2.13.20"/>
</service>

If the service were flat (that is, with no parent/child relationships), it would need to be configured as follows:
  • The service would need four nfsclient resources - one for each combination of file system and target machine (two file systems × two target machines).
  • The service would need to specify export path and file system ID to each nfsclient, which introduces chances for errors in the configuration.
In Example C.5, "NFS Service Set Up for Resource Reuse and Inheritance" however, the NFS client resources nfsclient:bob and nfsclient:jim are defined once; likewise, the NFS export resource nfsexport:exports is defined once. All the attributes needed by the resources are inherited from parent resources. Because the inherited attributes are dynamic (and do not conflict with one another), it is possible to reuse those resources - which is why they are defined in the resources block. It may not be practical to configure some resources in multiple places. For example, configuring a file system resource in multiple places can result in mounting one file system on two nodes, therefore causing problems.
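
For comparison, a flat version of the same service might look like the following sketch. This is not taken from the reference configuration; the resource names, path attributes, and fsid values are hypothetical and follow the inheritance comments in Example C.5. The point is only that every export path and file system ID must be spelled out, and kept in sync, by hand:

<service name="foo">
    <fs name="1" mountpoint="/mnt/foo" device="/dev/sdb1" fsid="12344"/>
    <fs name="2" mountpoint="/mnt/bar" device="/dev/sdb2" fsid="12345"/>
    <nfsexport name="exports"/>
    <!-- hypothetical flat equivalent: four nfsclient resources, each carrying
         its own path and fsid instead of inheriting them -->
    <nfsclient name="bob-foo" target="bob.example.com" path="/mnt/foo" options="rw,no_root_squash,fsid=12344"/>
    <nfsclient name="jim-foo" target="jim.example.com" path="/mnt/foo" options="rw,no_root_squash,fsid=12344"/>
    <nfsclient name="bob-bar" target="bob.example.com" path="/mnt/bar" options="rw,no_root_squash,fsid=12345"/>
    <nfsclient name="jim-bar" target="jim.example.com" path="/mnt/bar" options="rw,no_root_squash,fsid=12345"/>
    <ip address="10.2.13.20"/>
</service>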

C.4. Failure Recovery and Independent Subtrees

In most enterprise environments, the normal course of action for failure recovery of a service is to restart the entire service if any component in the service fails. For example, in Example C.6, "Service foo Normal Failure Recovery", if any of the scripts defined in this service fail, the normal course of action is to restart (or relocate or disable, according to the service recovery policy) the service. However, in some circumstances certain parts of a service may be considered non-critical; it may be necessary to restart only part of the service in place before attempting normal recovery. To accomplish that, you can use the __independent_subtree attribute. For example, in Example C.7, "Service foo Failure Recovery with __independent_subtree Attribute", the __independent_subtree attribute is used to accomplish the following actions:
  • If script:script_one fails, restart script:script_one, script:script_two, and script:script_three.
  • If script:script_two fails, restart just script:script_two.
  • If script:script_three fails, restart script:script_one, script:script_two, and script:script_three.
  • If script:script_four fails, restart the whole service.

Example C.6. Service foo Normal Failure Recovery

<service name="foo">  <script name="script_one" ...>  <script name="script_two" .../>  </script>  <script name="script_three" .../></service>

Example C.7. Service foo Failure Recovery with __independent_subtree Attribute

<service name="foo">  <script name="script_one" __independent_subtree="1" ...>  <script name="script_two" __independent_subtree="1" .../>  <script name="script_three" .../>  </script>  <script name="script_four" .../></service>

In some circumstances, if a component of a service fails you may want to disable only that component without disabling the entire service, to avoid affecting other services that use other components of that service. As of the Red Hat Enterprise Linux 6.1 release, you can accomplish that by using the __independent_subtree="2" attribute, which designates the independent subtree as non-critical.

Note

You may only use the non-critical flag on singly-referenced resources. The non-critical flag works with all resources at all levels of the resource tree, but should not be used at the top level when defining services or virtual machines.
As of the Red Hat Enterprise Linux 6.1 release, you can set maximum restart and restart expirations on a per-node basis in the resource tree for independent subtrees. To set these thresholds, you can use the following attributes:
  • __max_restarts configures the maximum number of tolerated restarts prior to giving up.
  • __restart_expire_time configures the amount of time, in seconds, after which a restart is no longer attempted.
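
A minimal sketch of how these attributes might be combined with an independent subtree, using Example C.7 as a base; the values of three restarts and a 300-second restart expiration are hypothetical:

<service name="foo">
    <!-- non-critical independent subtree with hypothetical restart thresholds -->
    <script name="script_one" __independent_subtree="2" __max_restarts="3" __restart_expire_time="300" ...>
        <script name="script_two" .../>
    </script>
    <script name="script_three" .../>
</service>

With __independent_subtree="2", the intent is that exhausting the restart threshold disables only the non-critical subtree rather than the whole service, consistent with the non-critical flag described above.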

C.5. Debugging and Testing Services and Resource Ordering

You can debug and test services and resource ordering with the rg_test utility. rg_test is a command-line utility provided by the rgmanager package that is run from a shell or a terminal (it is not available in Conga). Table C.2, "rg_test Utility Summary" summarizes the actions and syntax for the rg_test utility.

Table C.2. rg_test Utility Summary

Action: Display the resource rules that rg_test understands.
Syntax: rg_test rules

Action: Test a configuration (and /usr/share/cluster) for errors or redundant resource agents.
Syntax: rg_test test /etc/cluster/cluster.conf

Action: Display the start and stop ordering of a service.
Syntax:
    Display start order:
        rg_test noop /etc/cluster/cluster.conf start service servicename
    Display stop order:
        rg_test noop /etc/cluster/cluster.conf stop service servicename

Action: Explicitly start or stop a service.

    Important
    Only do this on one node, and always disable the service in rgmanager first.

Syntax:
    Start a service:
        rg_test test /etc/cluster/cluster.conf start service servicename
    Stop a service:
        rg_test test /etc/cluster/cluster.conf stop service servicename

Action: Calculate and display the resource tree delta between two cluster.conf files.
Syntax: rg_test delta cluster.conf file 1 cluster.conf file 2
    For example:
        rg_test delta /etc/cluster/cluster.conf.bak /etc/cluster/cluster.conf
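
Putting these commands together, a typical pre-deployment check of an edited configuration might look like the following sketch; the service name foo is hypothetical, and the commands themselves are taken from the table above:

rg_test test /etc/cluster/cluster.conf
rg_test noop /etc/cluster/cluster.conf start service foo
rg_test delta /etc/cluster/cluster.conf.bak /etc/cluster/cluster.conf

This validates the configuration, shows the start ordering that rgmanager would use for the service, and then displays what changed relative to the backed-up copy.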

Cluster Service Resource Check and Failover Timeout

This appendix describes how rgmanager monitors the status of cluster resources, and how to modify the status check interval. The appendix also describes the __enforce_timeouts service parameter, which indicates that a timeout for an operation should cause a service to fail.

Note

To fully comprehend the information in this appendix, you may require detailed understanding of resource agents and the cluster configuration file, /etc/cluster/cluster.conf. For a comprehensive list and description of cluster.conf elements and attributes, refer to the cluster schema at /usr/share/cluster/cluster.rng, and the annotated schema at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example /usr/share/doc/cman-3.0.12/cluster_conf.html).

D.1. Modifying the Resource Status Check Interval

rgmanager checks the status of individual resources, not whole services. Every 10 seconds, rgmanager scans the resource tree, looking for resources that have passed their "status check" interval.
Each resource agent specifies the amount of time between periodic status checks. Each resource utilizes these timeout values unless explicitly overridden in the cluster.conf file using the special <action> tag:
<action name="status" depth="*" interval="10" />
This tag is a special child of the resource itself in the cluster.conf file. For example, if you had a file system resource for which you wanted to override the status check interval you could specify the file system resource in the cluster.conf file as follows:
  <fs name="test" device="/dev/sdb3"> <action name="status" depth="*" interval="10" /> <nfsexport...> </nfsexport>  </fs>
Some agents provide multiple "depths" of checking. For example, a normal file system status check (depth 0) checks whether the file system is mounted in the correct place. A more intensive check is depth 10, which checks whether you can read a file from the file system. A status check of depth 20 checks whether you can write to the file system. In the example given here, the depth is set to *, which indicates that these values should be used for all depths. The result is that the test file system is checked at the highest-defined depth provided by the resource-agent (in this case, 20) every 10 seconds.
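
If you only want to tune one depth rather than all of them, the depth attribute of the <action> tag accepts a specific value in place of the * wildcard. A minimal sketch, assuming you want the expensive depth 20 (write) check to run less frequently while the other depths keep their defaults; the 60-second interval is a hypothetical value:

<fs name="test" device="/dev/sdb3">
    <action name="status" depth="20" interval="60"/>
</fs>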

D.2. Enforcing Resource Timeouts

There is no timeout for starting, stopping, or failing over resources. Some resources take an indeterminately long amount of time to start or stop. Unfortunately, a failure to stop (including a timeout) renders the service inoperable (failed state). You can, if desired, turn on timeout enforcement on each resource in a service individually by adding __enforce_timeouts="1" to the reference in the cluster.conf file.
The following example shows a cluster service that has been configured with the __enforce_timeouts attribute set for the netfs resource. With this attribute set, if it takes more than 30 seconds to unmount the NFS file system during a recovery process, the operation times out and the service enters the failed state.
<rm>
    <failoverdomains/>
    <resources>
        <netfs export="/nfstest" force_unmount="1" fstype="nfs" host="10.65.48.65" mountpoint="/data/nfstest" name="nfstest_data" options="rw,sync,soft"/>
    </resources>
    <service autostart="1" exclusive="0" name="nfs_client_test" recovery="relocate">
        <netfs ref="nfstest_data" __enforce_timeouts="1"/>
    </service>
</rm>

Command Line Tools Summary

Table E.1, "Command Line Tool Summary" summarizes preferred command-line tools for configuring and managing the High Availability Add-On. For more information about commands and variables, refer to the man page for each command-line tool.

Table E.1. Command Line Tool Summary

ccs_config_dump - Cluster Configuration Dump Tool
    Used With: Cluster Infrastructure
    Purpose: ccs_config_dump generates XML output of the running configuration. The running configuration is sometimes different from the stored configuration on file because some subsystems store or set default information in the configuration. Those values are generally not present in the on-disk version of the configuration but are required at runtime for the cluster to work properly. For more information about this tool, refer to the ccs_config_dump(8) man page.

ccs_config_validate - Cluster Configuration Validation Tool
    Used With: Cluster Infrastructure
    Purpose: ccs_config_validate validates cluster.conf against the schema, cluster.rng (located in /usr/share/cluster/cluster.rng on each node). For more information about this tool, refer to the ccs_config_validate(8) man page.

clustat - Cluster Status Utility
    Used With: High-availability Service Management Components
    Purpose: The clustat command displays the status of the cluster. It shows membership information, quorum view, and the state of all configured user services. For more information about this tool, refer to the clustat(8) man page.

clusvcadm - Cluster User Service Administration Utility
    Used With: High-availability Service Management Components
    Purpose: The clusvcadm command allows you to enable, disable, relocate, and restart high-availability services in a cluster. For more information about this tool, refer to the clusvcadm(8) man page.

cman_tool - Cluster Management Tool
    Used With: Cluster Infrastructure
    Purpose: cman_tool is a program that manages the CMAN cluster manager. It provides the capability to join a cluster, leave a cluster, kill a node, or change the expected quorum votes of a node in a cluster. For more information about this tool, refer to the cman_tool(8) man page.

fence_tool - Fence Tool
    Used With: Cluster Infrastructure
    Purpose: fence_tool is a program used to join and leave the fence domain. For more information about this tool, refer to the fence_tool(8) man page.
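
As a brief illustration of how the service-management tools in this table fit together, the following sketch checks cluster status and then disables, enables, and relocates a service. The service name nfs_client_test and the node names neo-01 and neo-02 are reused from examples elsewhere in this manual and are illustrative only:

# clustat                                  # show membership, quorum, and service state
# clusvcadm -d nfs_client_test             # disable the service
# clusvcadm -e nfs_client_test -m neo-01   # enable the service on a preferred member
# clusvcadm -r nfs_client_test -m neo-02   # relocate the service to another member

Refer to the clustat(8) and clusvcadm(8) man pages for the complete set of options.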

High Availability LVM (HA-LVM)

The Red Hat High Availability Add-On provides support for high availability LVM volumes (HA-LVM) in a failover configuration. This is distinct from active/active configurations enabled by the Clustered Logical Volume Manager (CLVM), which is a set of clustering extensions to LVM that allow a cluster of computers to manage shared storage.
When to use CLVM or HA-LVM should be based on the needs of the applications or services being deployed.
  • If the applications are cluster-aware and have been tuned to run simultaneously on multiple machines at a time, then CLVM should be used. Specifically, if more than one node of your cluster will require access to your storage which is then shared among the active nodes, then you must use CLVM. CLVM allows a user to configure logical volumes on shared storage by locking access to physical storage while a logical volume is being configured, and uses clustered locking services to manage the shared storage. For information on CLVM, and on LVM configuration in general, refer to Logical Volume Manager Administration.
  • If the applications run optimally in active/passive (failover) configurations where only a single node that accesses the storage is active at any one time, you should use High Availability Logical Volume Management agents (HA-LVM).
Most applications will run better in an active/passive configuration, as they are not designed or optimized to run concurrently with other instances. Choosing to run an application that is not cluster-aware on clustered logical volumes may result in degraded performance if the logical volume is mirrored, because there is cluster communication overhead for the logical volumes themselves in these instances. A cluster-aware application must be able to achieve performance gains that exceed the performance losses introduced by cluster file systems and cluster-aware logical volumes; this is easier to achieve for some applications and workloads than for others. To choose between the two LVM variants, determine the requirements of the cluster and whether the extra effort of optimizing for an active/active cluster will pay dividends. Most users will achieve the best HA results from using HA-LVM.
HA-LVM and CLVM are similar in that they prevent corruption of LVM metadata and its logical volumes, which could otherwise occur if multiple machines were allowed to make overlapping changes. HA-LVM imposes the restriction that a logical volume can only be activated exclusively; that is, active on only one machine at a time. This means that only local (non-clustered) implementations of the storage drivers are used. Avoiding the cluster coordination overhead in this way increases performance. CLVM does not impose these restrictions - a user is free to activate a logical volume on all machines in a cluster; this forces the use of cluster-aware storage drivers, which allow for cluster-aware file systems and applications to be put on top.
HA-LVM can be set up to use one of two methods for achieving its mandate of exclusive logical volume activation.
  • The preferred method uses CLVM, but it will only ever activate the logical volumes exclusively. This has the advantage of easier setup and better prevention of administrative mistakes (like removing a logical volume that is in use). In order to use CLVM, the High Availability Add-On and Resilient Storage Add-On software, including the clvmd daemon, must be running.
    The procedure for configuring HA-LVM using this method is described in Section F.1, "Configuring HA-LVM Failover with CLVM (preferred)".
  • The second method uses local machine locking and LVM "tags". This method has the advantage of not requiring any LVM cluster packages; however, there are more steps involved in setting it up and it does not prevent an administrator from mistakenly removing a logical volume from a node in the cluster where it is not active. The procedure for configuring HA-LVM using this method is described in Section F.2, "Configuring HA-LVM Failover with Tagging".

F.1. Configuring HA-LVM Failover with CLVM (preferred)

To set up HA-LVM failover (using the preferred CLVM variant), perform the following steps:
  1. Ensure that your system is configured to support CLVM, which requires the following:
    • The High Availability Add-On and Resilient Storage Add-On are installed, including the cmirror package if the CLVM logical volumes are to be mirrored.
    • The locking_type parameter in the global section of the /etc/lvm/lvm.conf file is set to the value '3'.
    • The High Availability Add-On and Resilient Storage Add-On software, including the clvmd daemon, must be running. For CLVM mirroring, the cmirrord service must be started as well. (A command sketch covering these prerequisites follows this procedure.)
  2. Create the logical volume and file system using standard LVM and file system commands, as in the following example.
    # pvcreate /dev/sd[cde]1
    # vgcreate -cy shared_vg /dev/sd[cde]1
    # lvcreate -L 10G -n ha_lv shared_vg
    # mkfs.ext4 /dev/shared_vg/ha_lv
    # lvchange -an shared_vg/ha_lv
    For information on creating LVM logical volumes, refer to Logical Volume Manager Administration.
  3. Edit the /etc/cluster/cluster.conf file to include the newly created logical volume as a resource in one of your services. Alternately, you can use Conga or the ccs command to configure LVM and file system resources for the cluster. The following is a sample resource manager section from the /etc/cluster/cluster.conf file that configures a CLVM logical volume as a cluster resource:
    <rm>
        <failoverdomains>
            <failoverdomain name="FD" ordered="1" restricted="0">
                <failoverdomainnode name="neo-01" priority="1"/>
                <failoverdomainnode name="neo-02" priority="2"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <lvm name="lvm" vg_name="shared_vg" lv_name="ha_lv"/>
            <fs name="FS" device="/dev/shared_vg/ha_lv" force_fsck="0" force_unmount="1" fsid="64050" fstype="ext4" mountpoint="/mnt" options="" self_fence="0"/>
        </resources>
        <service autostart="1" domain="FD" name="serv" recovery="relocate">
            <lvm ref="lvm"/>
            <fs ref="FS"/>
        </service>
    </rm>
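
For step 1 of this procedure, a minimal command sketch of the prerequisites on each cluster node might look like the following. The package names are assumptions based on standard Red Hat Enterprise Linux 6 packaging, and lvmconf is used here as a convenience for setting locking_type to 3:

# yum install lvm2-cluster cmirror            # clvmd and, for mirroring, cmirror (assumed package names)
# lvmconf --enable-cluster                    # sets locking_type = 3 in /etc/lvm/lvm.conf
# service clvmd start && chkconfig clvmd on
# service cmirrord start && chkconfig cmirrord on    # only if CLVM mirroring is used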

F.2. Configuring HA-LVM Failover with Tagging

To set up HA-LVM failover by using tags in the /etc/lvm/lvm.conf file, perform the following steps:
  1. Ensure that the locking_type parameter in the global section of the /etc/lvm/lvm.conf file is set to the value '1'.
  2. Create the logical volume and file system using standard LVM and file system commands, as in the following example.
    # pvcreate /dev/sd[cde]1
    # vgcreate shared_vg /dev/sd[cde]1
    # lvcreate -L 10G -n ha_lv shared_vg
    # mkfs.ext4 /dev/shared_vg/ha_lv
    For information on creating LVM logical volumes, refer to Logical Volume Manager Administration.
  3. Edit the /etc/cluster/cluster.conf file to include the newly created logical volume as a resource in one of your services. Alternately, you can use Conga or the ccs command to configure LVM and file system resources for the cluster. The following is a sample resource manager section from the /etc/cluster/cluster.conf file that configures a CLVM logical volume as a cluster resource:
    <rm>
        <failoverdomains>
            <failoverdomain name="FD" ordered="1" restricted="0">
                <failoverdomainnode name="neo-01" priority="1"/>
                <failoverdomainnode name="neo-02" priority="2"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <lvm name="lvm" vg_name="shared_vg" lv_name="ha_lv"/>
            <fs name="FS" device="/dev/shared_vg/ha_lv" force_fsck="0" force_unmount="1" fsid="64050" fstype="ext4" mountpoint="/mnt" options="" self_fence="0"/>
        </resources>
        <service autostart="1" domain="FD" name="serv" recovery="relocate">
            <lvm ref="lvm"/>
            <fs ref="FS"/>
        </service>
    </rm>

    Note

    If there are multiple logical volumes in the volume group, then the logical volume name (lv_name) in the lvm resource should be left blank or unspecified. Also note that in an HA-LVM configuration, a volume group may be used by only a single service.
  4. Edit the volume_list field in the /etc/lvm/lvm.conf file. Include the name of your root volume group and your hostname as listed in the /etc/cluster/cluster.conf file preceded by @. The hostname to include here is the machine on which you are editing the lvm.conf file, not any remote hostname. Note that this string MUST match the node name given in the cluster.conf file. Below is a sample entry from the /etc/lvm/lvm.conf file:
    volume_list = [ "VolGroup00", "@neo-01" ]
    This tag will be used to activate shared VGs or LVs. DO NOT include the names of any volume groups that are to be shared using HA-LVM.
  5. Update the initrd image on all your cluster nodes:
    # dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
  6. Reboot all nodes to ensure the correct initrd image is in use.

Revision History

Revision 5.0-25, Mon Feb 18 2013, Steven Levine
Version for 6.4 GA release
Revision 5.0-23, Wed Jan 30 2013, Steven Levine
Resolves: 901641
Corrects and clarifies iptables rules.
Revision 5.0-22, Tue Jan 29 2013, Steven Levine
Resolves: 788636
Documents RRP configuration through ccs command.
Resolves: 789010
Documents RRP configuration in the cluster.conf file.
Revision 5.0-20, Fri Jan 18 2013, Steven Levine
Resolves: 894097
Removes advice to ensure you are not using VLAN tagging.
Resolves: 845365
Indicates that bonding modes 0 and 2 are now supported.
Revision 5.0-19, Thu Jan 17 2013, Steven Levine
Resolves: 896234
Clarifies terminology of cluster node references.
Revision 5.0-16, Mon Nov 26 2012, Steven Levine
Version for 6.4 Beta release
Revision 5.0-15, Wed Nov 20 2012, Steven Levine
Resolves: 838988
Documents nfsrestart attribute for file system resource agents.
Resolves: 843169
Documents IBM iPDU fence agent.
Resolves: 846121
Documents Eaton Network Power Controller (SNMP Interface) fence agent.
Resolves: 856834
Documents HP Bladesystem fence agent.
Resolves: 865313
Documents NFS Server resource agent.
Resolves: 862281
Clarifies which ccs commands overwrite previous settings.
Resolves: 846205
Documents iptables firewall filtering for igmp component.
Resolves: 857172
Documents ability to remove users from luci.
Resolves: 857165
Documents the privilege level parameter of the IPMI fence agent.
Resolves: 840912
Clears up formatting issue with resource parameter table.
Resolves: 849240, 870292
Clarifies installation procedure.
Resolves: 871165
Clarifies description of IP address parameter in description of IP address resource agent.
Resolves: 845333, 869039, 856681
Fixes small typographical errors and clarifies small technical ambiguities.
Revision 5.0-12, Thu Nov 1 2012, Steven Levine
Added newly-supported fence agents.
Revision 5.0-7, Thu Oct 25 2012, Steven Levine
Added section on override semantics.
Revision 5.0-6, Tue Oct 23 2012, Steven Levine
Fixed default value of Post Join Delay.
Revision 5.0-4, Tue Oct 16 2012, Steven Levine
Added description of NFS server resource.
Revision 5.0-2, Thu Oct 11 2012, Steven Levine
Updates to Conga descriptions.
Revision 5.0-1, Mon Oct 8 2012, Steven Levine
Clarifying ccs semantics
Revision 4.0-5, Fri Jun 15 2012, Steven Levine
Version for 6.3 GA release
Revision 4.0-4, Tue Jun 12 2012, Steven Levine
Resolves: 830148
Ensures consistency of port number examples for luci.
Revision 4.0-3, Tue May 21 2012, Steven Levine
Resolves: 696897
Adds cluster.conf parameter information to tables of fence device parameters and resource parameters.
Resolves: 811643
Adds procedure for restoring a luci database on a separate machine.
Revision 4.0-2, Wed Apr 25 2012, Steven Levine
Resolves: 815619
Removes warning about using UDP Unicast with GFS2 file systems.
Revision 4.0-1, Fri Mar 30 2012, Steven Levine
Resolves: 771447, 800069, 800061
Updates documentation of luci to be consistent with Red Hat Enterprise Linux 6.3 version.
Resolves: 712393
Adds information on capturing an application core for RGManager.
Resolves: 800074
Documents condor resource agent.
Resolves: 757904
Documents luci configuration backup and restore.
Resolves: 772374
Adds section on managing virtual machines in a cluster.
Resolves: 712378
Adds documentation for HA-LVM configuration.
Resolves: 712400
Documents debug options.
Resolves: 751156
Documents new fence_ipmilan parameter.
Resolves: 721373
Documents which configuration changes require a cluster restart.
Revision 3.0-5, Thu Dec 1 2011, Steven Levine
Release for GA of Red Hat Enterprise Linux 6.2
Resolves: 755849
Corrects monitor_link parameter example.
Revision 3.0-4, Mon Nov 7 2011, Steven Levine
Resolves: 749857
Adds documentation for RHEV-M REST API fence device.
Revision 3.0-3, Fri Oct 21 2011, Steven Levine
Resolves: #747181, #747182, #747184, #747185, #747186, #747187, #747188, #747189, #747190, #747192
Corrects typographical errors and ambiguities found during documentation QE review for Red Hat Enterprise Linux 6.2.
Revision 3.0-2, Fri Oct 7 2011, Steven Levine
Resolves: #743757
Corrects reference to supported bonding mode in troubleshooting section.
Revision 3.0-1, Wed Sep 28 2011, Steven Levine
Initial revision for Red Hat Enterprise Linux 6.2 Beta release
Resolves: #739613
Documents support for new ccs options to display available fence devices and available services.
Resolves: #707740
Documents updates to the Conga interface and documents support for setting user permissions to administer Conga.
Resolves: #731856
Documents supports for configuring luci by means of the /etc/sysconfig/luci file.
Resolves: #736134
Documents support for UDPU transport.
Resolves: #736143
Documents support for clustered Samba.
Resolves: #617634
Documents how to configure the only IP address luci is served at.
Resolves: #713259
Documents support for fence_vmware_soap agent.
Resolves: #721009
Provides link to Support Essentials article.
Resolves: #717006
Provides information on allowing multicast traffic through the iptables firewall.
Resolves: #717008
Provides information about cluster service status check and failover timeout.
Resolves: #711868
Clarifies description of autostart.
Resolves: #728337
Documents procedure for adding vm resources with the ccs command.
Resolves: #725315, #733011, #733074, #733689
Corrects small typographical errors.
Revision 2.0-1, Thu May 19 2011, Steven Levine
Initial revision for Red Hat Enterprise Linux 6.1
Resolves: #671250
Documents support for SNMP traps.
Resolves: #659753
Documents ccs command.
Resolves: #665055
Updates Conga documentation to reflect updated display and feature support.
Resolves: #680294
Documents need for password access for ricci agent.
Resolves: #687871
Adds chapter on troubleshooting.
Resolves: #673217
Fixes typographical error.
Resolves: #675805
Adds reference to cluster.conf schema to tables of HA resource parameters.
Resolves: #672697
Updates tables of fence device parameters to include all currently supported fencing devices.
Resolves: #677994
Corrects information for fence_ilo fence agent parameters.
Resolves: #629471
Adds technical note about setting consensus value in a two-node cluster.
Resolves: #579585
Updates section on upgrading Red Hat High Availability Add-On Software.
Resolves: #643216
Clarifies small issues throughout document.
Resolves: #643191
Provides improvements and corrections for the luci documentation.
Resolves: #704539
Updates the table of Virtual Machine resource parameters.
Revision 1.0-1, Wed Nov 10 2010, Paul Kennedy
Initial release for Red Hat Enterprise Linux 6

Index

A

ACPI
configuring, Configuring ACPI For Use with Integrated Fence Devices
APC power switch over SNMP fence device, Fence Device Parameters
APC power switch over telnet/SSH fence device, Fence Device Parameters

B

behavior, HA resources, HA Resource Behavior
Brocade fabric switch fence device, Fence Device Parameters

C

CISCO MDS fence device, Fence Device Parameters
Cisco UCS fence device, Fence Device Parameters
cluster
administration, Before Configuring the Red Hat High Availability Add-On, Managing Red Hat High Availability Add-On With Conga, Managing Red Hat High Availability Add-On With ccs, Managing Red Hat High Availability Add-On With Command Line Tools
diagnosing and correcting problems, Diagnosing and Correcting Problems in a Cluster, Diagnosing and Correcting Problems in a Cluster
starting, stopping, restarting, Starting and Stopping the Cluster Software
cluster administration, Before Configuring the Red Hat High Availability Add-On, Managing Red Hat High Availability Add-On With Conga, Managing Red Hat High Availability Add-On With ccs, Managing Red Hat High Availability Add-On With Command Line Tools
adding cluster node, Adding a Member to a Running Cluster, Adding a Member to a Running Cluster
compatible hardware, Compatible Hardware
configuration validation, Configuration Validation
configuring ACPI, Configuring ACPI For Use with Integrated Fence Devices
configuring iptables, Enabling IP Ports
considerations for using qdisk, Considerations for Using Quorum Disk
considerations for using quorum disk, Considerations for Using Quorum Disk
deleting a cluster, Starting, Stopping, Restarting, and Deleting Clusters
deleting a node from the configuration; adding a node to the configuration, Deleting or Adding a Node
diagnosing and correcting problems in a cluster, Diagnosing and Correcting Problems in a Cluster, Diagnosing and Correcting Problems in a Cluster
displaying HA services with clustat, Displaying HA Service Status with clustat
enabling IP ports, Enabling IP Ports
general considerations, General Configuration Considerations
joining a cluster, Causing a Node to Leave or Join a Cluster, Causing a Node to Leave or Join a Cluster
leaving a cluster, Causing a Node to Leave or Join a Cluster, Causing a Node to Leave or Join a Cluster
managing cluster node, Managing Cluster Nodes, Managing Cluster Nodes
managing high-availability services, Managing High-Availability Services, Managing High-Availability Services
managing high-availability services, freeze and unfreeze, Managing HA Services with clusvcadm, Considerations for Using the Freeze and Unfreeze Operations
network switches and multicast addresses, Multicast Addresses
NetworkManager, Considerations for NetworkManager
rebooting cluster node, Rebooting a Cluster Node
removing cluster node, Deleting a Member from a Cluster
restarting a cluster, Starting, Stopping, Restarting, and Deleting Clusters
ricci considerations, Considerations for ricci
SELinux, Red Hat High Availability Add-On and SELinux
starting a cluster, Starting, Stopping, Restarting, and Deleting Clusters, Starting and Stopping a Cluster
starting, stopping, restarting a cluster, Starting and Stopping the Cluster Software
stopping a cluster, Starting, Stopping, Restarting, and Deleting Clusters, Starting and Stopping a Cluster
updating a cluster configuration using cman_tool version -r, Updating a Configuration Using cman_tool version -r
updating a cluster configuration using scp, Updating a Configuration Using scp
updating configuration, Updating a Configuration
virtual machines, Configuring Virtual Machines in a Clustered Environment
cluster configuration, Configuring Red Hat High Availability Add-On With Conga, Configuring Red Hat High Availability Add-On With the ccs Command, Configuring Red Hat High Availability Add-On With Command Line Tools
deleting or adding a node, Deleting or Adding a Node
updating, Updating a Configuration
cluster resource relationships, Parent, Child, and Sibling Relationships Among Resources
cluster resource status check, Cluster Service Resource Check and Failover Timeout
cluster resource types, Considerations for Configuring HA Services
cluster service managers
configuration, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster
cluster services, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster
(see also adding to the cluster configuration)
cluster software
configuration, Configuring Red Hat High Availability Add-On With Conga, Configuring Red Hat High Availability Add-On With the ccs Command, Configuring Red Hat High Availability Add-On With Command Line Tools
configuration
HA service, Considerations for Configuring HA Services
Configuring High Availability LVM, High Availability LVM (HA-LVM)
Conga
accessing, Configuring Red Hat High Availability Add-On Software
consensus value, The consensus Value for totem in a Two-Node Cluster

D

Dell DRAC 5 fence device, Fence Device Parameters

E

Eaton network power switch, Fence Device Parameters
Egenera SAN controller fence device, Fence Device Parameters
ePowerSwitch fence device, Fence Device Parameters

F

failover timeout, Cluster Service Resource Check and Failover Timeout
features, new and changed, New and Changed Features
feedback, Feedback
fence agent
fence_apc, Fence Device Parameters
fence_apc_snmp, Fence Device Parameters
fence_bladecenter, Fence Device Parameters
fence_brocade, Fence Device Parameters
fence_cisco_mds, Fence Device Parameters
fence_cisco_ucs, Fence Device Parameters
fence_drac5, Fence Device Parameters
fence_eaton_snmp, Fence Device Parameters
fence_egenera, Fence Device Parameters
fence_eps, Fence Device Parameters
fence_hpblade, Fence Device Parameters
fence_ibmblade, Fence Device Parameters
fence_ifmib, Fence Device Parameters
fence_ilo, Fence Device Parameters
fence_ilo_mp, Fence Device Parameters
fence_intelmodular, Fence Device Parameters
fence_ipdu, Fence Device Parameters
fence_ipmilan, Fence Device Parameters
fence_rhevm, Fence Device Parameters
fence_rsb, Fence Device Parameters
fence_scsi, Fence Device Parameters
fence_virt, Fence Device Parameters
fence_vmware_soap, Fence Device Parameters
fence_wti, Fence Device Parameters
fence device
APC power switch over SNMP, Fence Device Parameters
APC power switch over telnet/SSH, Fence Device Parameters
Brocade fabric switch, Fence Device Parameters
Cisco MDS, Fence Device Parameters
Cisco UCS, Fence Device Parameters
Dell DRAC 5, Fence Device Parameters
Eaton network power switch, Fence Device Parameters
Egenera SAN controller, Fence Device Parameters
ePowerSwitch, Fence Device Parameters
Fence virt, Fence Device Parameters
Fujitsu Siemens Remoteview Service Board (RSB), Fence Device Parameters
HP BladeSystem, Fence Device Parameters
HP iLO MP, Fence Device Parameters
HP iLO/iLO2, Fence Device Parameters
IBM BladeCenter, Fence Device Parameters
IBM BladeCenter SNMP, Fence Device Parameters
IBM iPDU, Fence Device Parameters
IF MIB, Fence Device Parameters
Intel Modular, Fence Device Parameters
IPMI LAN, Fence Device Parameters
RHEV-M REST API, Fence Device Parameters
SCSI fencing, Fence Device Parameters
VMware (SOAP interface), Fence Device Parameters
WTI power switch, Fence Device Parameters
Fence virt fence device, Fence Device Parameters
fence_apc fence agent, Fence Device Parameters
fence_apc_snmp fence agent, Fence Device Parameters
fence_bladecenter fence agent, Fence Device Parameters
fence_brocade fence agent, Fence Device Parameters
fence_cisco_mds fence agent, Fence Device Parameters
fence_cisco_ucs fence agent, Fence Device Parameters
fence_drac5 fence agent, Fence Device Parameters
fence_eaton_snmp fence agent, Fence Device Parameters
fence_egenera fence agent, Fence Device Parameters
fence_eps fence agent, Fence Device Parameters
fence_hpblade fence agent, Fence Device Parameters
fence_ibmblade fence agent, Fence Device Parameters
fence_ifmib fence agent, Fence Device Parameters
fence_ilo fence agent, Fence Device Parameters
fence_ilo_mp fence agent, Fence Device Parameters
fence_intelmodular fence agent, Fence Device Parameters
fence_ipdu fence agent, Fence Device Parameters
fence_ipmilan fence agent, Fence Device Parameters
fence_rhevm fence agent, Fence Device Parameters
fence_rsb fence agent, Fence Device Parameters
fence_scsi fence agent, Fence Device Parameters
fence_virt fence agent, Fence Device Parameters
fence_vmware_soap fence agent, Fence Device Parameters
fence_wti fence agent, Fence Device Parameters
Fujitsu Siemens Remoteview Service Board (RSB) fence device, Fence Device Parameters

G

general
considerations for cluster administration, General Configuration Considerations

H

HA service configuration
overview, Considerations for Configuring HA Services
hardware
compatible, Compatible Hardware
HP Bladesystem fence device, Fence Device Parameters
HP iLO MP fence device, Fence Device Parameters
HP iLO/iLO2 fence device, Fence Device Parameters

I

IBM BladeCenter fence device, Fence Device Parameters
IBM BladeCenter SNMP fence device, Fence Device Parameters
IBM iPDU fence device, Fence Device Parameters
IF MIB fence device, Fence Device Parameters
integrated fence devices
configuring ACPI, Configuring ACPI For Use with Integrated Fence Devices
Intel Modular fence device, Fence Device Parameters
introduction, Introduction
other Red Hat Enterprise Linux documents, Introduction
IP ports
enabling, Enabling IP Ports
IPMI LAN fence device, Fence Device Parameters
iptables
configuring, Enabling IP Ports
iptables firewall, Configuring the iptables Firewall to Allow Cluster Components

L

LVM, High Availability, High Availability LVM (HA-LVM)

M

multicast addresses
considerations for using with network switches and multicast addresses, Multicast Addresses
multicast traffic, enabling, Configuring the iptables Firewall to Allow Cluster Components

N

NetworkManager
disable for use with cluster, Considerations for NetworkManager

O

overview
features, new and changed, New and Changed Features

P

parameters, fence device, Fence Device Parameters
parameters, HA resources, HA Resource Parameters

Q

qdisk
considerations for using, Considerations for Using Quorum Disk
quorum disk
considerations for using, Considerations for Using Quorum Disk

R

relationships
cluster resource, Parent, Child, and Sibling Relationships Among Resources
RHEV-M REST API fence device, Fence Device Parameters
ricci
considerations for cluster administration, Considerations for ricci

T

tables
fence devices, parameters, Fence Device Parameters
HA resources, parameters, HA Resource Parameters
timeout failover, Cluster Service Resource Check and Failover Timeout
tools, command line, Command Line Tools Summary
totem tag
consensus value, The consensus Value for totem in a Two-Node Cluster
troubleshooting
diagnosing and correcting problems in a cluster, Diagnosing and Correcting Problems in a Cluster, Diagnosing and Correcting Problems in a Cluster
types
cluster resource, Considerations for Configuring HA Services

V

validation
cluster configuration, Configuration Validation
virtual machines, in a cluster, Configuring Virtual Machines in a Clustered Environment
VMware (SOAP interface) fence device, Fence Device Parameters

W

WTI power switch fence device, Fence Device Parameters