Cluster Administration

This appendix describes common behavior of HA resources. It is meant to provide ancillary information that may be helpful in configuring HA services. You can configure the parameters with luci or by editing /etc/cluster/cluster.conf. For descriptions of HA resource parameters, refer to Appendix B, HA Resource Parameters. To understand resource agents in more detail, you can view them in /usr/share/cluster on any cluster node. To fully comprehend the information in this appendix, you may require a detailed understanding of resource agents and the cluster configuration file, /etc/cluster/cluster.conf.

An HA service is a group of cluster resources configured into a coherent entity that provides specialized services to clients. An HA service is represented as a resource tree in the cluster configuration file, /etc/cluster/cluster.conf (in each cluster node). In the cluster configuration file, each resource tree is an XML representation that specifies each resource, its attributes, and its relationships with other resources in the resource tree (parent, child, and sibling relationships).

Because an HA service consists of resources organized into a hierarchical tree, a service is sometimes referred to as a resource tree or resource group. Both phrases are synonymous with HA service. At the root of each resource tree is a special type of resource: a service resource. Other types of resources comprise the rest of a service, determining its characteristics. Configuring an HA service consists of creating a service resource, creating subordinate cluster resources, and organizing them into a coherent entity that conforms to hierarchical restrictions of the service.

This appendix consists of the following sections:

- Section C.1, Parent, Child, and Sibling Relationships Among Resources
- Section C.2, Sibling Start Ordering and Resource Child Ordering
- Section C.3, Inheritance, the <resources> Block, and Reusing Resources
- Section C.4, Failure Recovery and Independent Subtrees
- Section C.5, Debugging and Testing Services and Resource Ordering

The sections that follow present examples from the cluster configuration file, /etc/cluster/cluster.conf, for illustration purposes only.

C.1. Parent, Child, and Sibling Relationships Among Resources

A cluster service is an integrated entity that runs under the control of rgmanager. All resources in a service run on the same node. From the perspective of rgmanager, a cluster service is one entity that can be started, stopped, or relocated. Within a cluster service, however, the hierarchy of the resources determines the order in which each resource is started and stopped. The hierarchical levels consist of parent, child, and sibling.

Example C.1, "Resource Hierarchy of Service foo", shows a sample resource tree of the service foo. In the example, the relationships among the resources are as follows:

- fs:myfs (<fs name="myfs" ...>) and ip:10.1.1.2 (<ip address="10.1.1.2" .../>) are siblings.
- fs:myfs (<fs name="myfs" ...>) is the parent of script:script_child (<script name="script_child"/>).
- script:script_child (<script name="script_child"/>) is the child of fs:myfs (<fs name="myfs" ...>).
Example C.1. Resource Hierarchy of Service foo

    <service name="foo" ...>
        <fs name="myfs" ...>
            <script name="script_child"/>
        </fs>
        <ip address="10.1.1.2" .../>
    </service>

The following rules apply to parent/child relationships in a resource tree:

- Parents are started before children.
- Children must all stop cleanly before a parent may be stopped.
- For a resource to be considered in good health, all its children must be in good health.
C.2. Sibling Start Ordering and Resource Child Ordering

The Service resource determines the start order and the stop order of a child resource according to whether it designates a child-type attribute for a child resource, as follows:

- Designates child-type attribute (typed child resource): If the Service resource designates a child-type attribute for a child resource, the child resource is typed. The child-type attribute explicitly determines the start and the stop order of the child resource.
- Does not designate child-type attribute (non-typed child resource): If the Service resource does not designate a child-type attribute for a child resource, the child resource is non-typed. The Service resource does not explicitly control the starting order and stopping order of a non-typed child resource. However, a non-typed child resource is started and stopped according to its order in /etc/cluster/cluster.conf. In addition, non-typed child resources are started after all typed child resources have started and are stopped before any typed child resources have stopped.
The only resource to implement defined child resource type ordering is the Service resource.

C.2.1. Typed Child Resource Start and Stop Ordering

For a typed child resource, the type attribute for the child resource defines the start order and the stop order of each resource type with a number that can range from 1 to 100; one value for start, and one value for stop. The lower the number, the earlier a resource type starts or stops. For example, Table C.1, "Child Resource Type Start and Stop Order" shows the start and stop values for each resource type; Example C.2, "Resource Start and Stop Values: Excerpt from Service Resource Agent, service.sh" shows the start and stop values as they appear in the Service resource agent, service.sh. For the Service resource, all LVM children are started first, followed by all File System children, followed by all Script children, and so forth.

Table C.1. Child Resource Type Start and Stop Order

| Resource | Child Type | Start-order Value | Stop-order Value |
|---|---|---|---|
| LVM | lvm | 1 | 9 |
| File System | fs | 2 | 8 |
| GFS2 File System | clusterfs | 3 | 7 |
| NFS Mount | netfs | 4 | 6 |
| NFS Export | nfsexport | 5 | 5 |
| NFS Client | nfsclient | 6 | 4 |
| IP Address | ip | 7 | 2 |
| Samba | smb | 8 | 3 |
| Script | script | 9 | 1 |
Example C.2. Resource Start and Stop Values: Excerpt from Service Resource Agent, service.sh

    <special tag="rgmanager">
        <attributes root="1" maxinstances="1"/>
        <child type="lvm" start="1" stop="9"/>
        <child type="fs" start="2" stop="8"/>
        <child type="clusterfs" start="3" stop="7"/>
        <child type="netfs" start="4" stop="6"/>
        <child type="nfsexport" start="5" stop="5"/>
        <child type="nfsclient" start="6" stop="4"/>
        <child type="ip" start="7" stop="2"/>
        <child type="smb" start="8" stop="3"/>
        <child type="script" start="9" stop="1"/>
    </special>

Ordering within a resource type is preserved as it exists in the cluster configuration file, /etc/cluster/cluster.conf. For example, consider the starting order and stopping order of the typed child resources in Example C.3, "Ordering Within a Resource Type".

Example C.3. Ordering Within a Resource Type

    <service name="foo">
        <script name="1" .../>
        <lvm name="1" .../>
        <ip address="10.1.1.1" .../>
        <fs name="1" .../>
        <lvm name="2" .../>
    </service>

Typed Child Resource Starting Order

- lvm:1 - This is an LVM resource. All LVM resources are started first. lvm:1 (<lvm name="1" .../>) is the first LVM resource started among LVM resources because it is the first LVM resource listed in the Service foo portion of /etc/cluster/cluster.conf.
- lvm:2 - This is an LVM resource. All LVM resources are started first. lvm:2 (<lvm name="2" .../>) is started after lvm:1 because it is listed after lvm:1 in the Service foo portion of /etc/cluster/cluster.conf.
- fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
- ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
- script:1 - This is a Script resource. If there were other Script resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
Typed Child Resource Stopping Order

- script:1 - This is a Script resource. If there were other Script resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- lvm:2 - This is an LVM resource. All LVM resources are stopped last. lvm:2 (<lvm name="2" .../>) is stopped before lvm:1; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- lvm:1 - This is an LVM resource. All LVM resources are stopped last. lvm:1 (<lvm name="1" .../>) is stopped after lvm:2; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
C.2.2. Non-typed Child Resource Start and Stop Ordering

Additional considerations are required for non-typed child resources. For a non-typed child resource, starting order and stopping order are not explicitly specified by the Service resource. Instead, starting order and stopping order are determined according to the order of the child resource in /etc/cluster/cluster.conf. Additionally, non-typed child resources are started after all typed child resources and stopped before any typed child resources.

Example C.4. Non-typed and Typed Child Resource in a Service

    <service name="foo">
        <script name="1" .../>
        <nontypedresource name="foo"/>
        <lvm name="1" .../>
        <nontypedresourcetwo name="bar"/>
        <ip address="10.1.1.1" .../>
        <fs name="1" .../>
        <lvm name="2" .../>
    </service>

Non-typed Child Resource Starting Order

- lvm:1 - This is an LVM resource. All LVM resources are started first. lvm:1 (<lvm name="1" .../>) is the first LVM resource started among LVM resources because it is the first LVM resource listed in the Service foo portion of /etc/cluster/cluster.conf.
- lvm:2 - This is an LVM resource. All LVM resources are started first. lvm:2 (<lvm name="2" .../>) is started after lvm:1 because it is listed after lvm:1 in the Service foo portion of /etc/cluster/cluster.conf.
- fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
- ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
- script:1 - This is a Script resource. If there were other Script resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
- nontypedresource:foo - This is a non-typed resource. Because it is a non-typed resource, it is started after the typed resources start. In addition, its order in the Service resource is before the other non-typed resource, nontypedresourcetwo:bar; therefore, it is started before nontypedresourcetwo:bar. (Non-typed resources are started in the order that they appear in the Service resource.)
- nontypedresourcetwo:bar - This is a non-typed resource. Because it is a non-typed resource, it is started after the typed resources start. In addition, its order in the Service resource is after the other non-typed resource, nontypedresource:foo; therefore, it is started after nontypedresource:foo. (Non-typed resources are started in the order that they appear in the Service resource.)
Non-typed Child Resource Stopping Order

- nontypedresourcetwo:bar - This is a non-typed resource. Because it is a non-typed resource, it is stopped before the typed resources are stopped. In addition, its order in the Service resource is after the other non-typed resource, nontypedresource:foo; therefore, it is stopped before nontypedresource:foo. (Non-typed resources are stopped in the reverse order that they appear in the Service resource.)
- nontypedresource:foo - This is a non-typed resource. Because it is a non-typed resource, it is stopped before the typed resources are stopped. In addition, its order in the Service resource is before the other non-typed resource, nontypedresourcetwo:bar; therefore, it is stopped after nontypedresourcetwo:bar. (Non-typed resources are stopped in the reverse order that they appear in the Service resource.)
- script:1 - This is a Script resource. If there were other Script resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- ip:10.1.1.1 - This is an IP Address resource. If there were other IP Address resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- fs:1 - This is a File System resource. If there were other File System resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- lvm:2 - This is an LVM resource. All LVM resources are stopped last. lvm:2 (<lvm name="2" .../>) is stopped before lvm:1; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
- lvm:1 - This is an LVM resource. All LVM resources are stopped last. lvm:1 (<lvm name="1" .../>) is stopped after lvm:2; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
C.3. Inheritance, the <resources> Block, and Reusing Resources

Example C.5. NFS Service Set Up for Resource Reuse and Inheritance

    <resources>
        <nfsclient name="bob" target="bob.example.com" options="rw,no_root_squash"/>
        <nfsclient name="jim" target="jim.example.com" options="rw,no_root_squash"/>
        <nfsexport name="exports"/>
    </resources>
    <service name="foo">
        <fs name="1" mountpoint="/mnt/foo" device="/dev/sdb1" fsid="12344">
            <nfsexport ref="exports">  <!-- nfsexport's path and fsid attributes
                                            are inherited from the mountpoint and
                                            fsid attribute of the parent fs resource -->
                <nfsclient ref="bob"/> <!-- nfsclient's path is inherited from the
                                            mountpoint and the fsid is added to the
                                            options string during export -->
                <nfsclient ref="jim"/>
            </nfsexport>
        </fs>
        <fs name="2" mountpoint="/mnt/bar" device="/dev/sdb2" fsid="12345">
            <nfsexport ref="exports">
                <nfsclient ref="bob"/> <!-- Because all of the critical data for
                                            this resource is either defined in the
                                            resources block or inherited, we can
                                            reference it again! -->
                <nfsclient ref="jim"/>
            </nfsexport>
        </fs>
        <ip address="10.2.13.20"/>
    </service>

If the service were flat (that is, with no parent/child relationships), it would need to be configured as follows:

- The service would need four nfsclient resources - one per file system (a total of two for file systems), and one per target machine (a total of two for target machines).
- The service would need to specify export path and file system ID to each nfsclient, which introduces chances for errors in the configuration.

A sketch of such a flat configuration follows this list.
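The following sketch illustrates the duplication that such a flat configuration would require. It is an approximation for illustration only: the resource names and the placement of the file system ID in the options string are assumptions, not configuration taken from a tested cluster.

    <service name="foo">
        <fs name="1" mountpoint="/mnt/foo" device="/dev/sdb1" fsid="12344"/>
        <fs name="2" mountpoint="/mnt/bar" device="/dev/sdb2" fsid="12345"/>
        <nfsexport name="exports"/>
        <!-- Each client/file-system pairing must repeat the export path and
             file system ID by hand, so a typo in any one entry breaks it. -->
        <nfsclient name="bob-foo" target="bob.example.com" path="/mnt/foo" options="rw,no_root_squash,fsid=12344"/>
        <nfsclient name="jim-foo" target="jim.example.com" path="/mnt/foo" options="rw,no_root_squash,fsid=12344"/>
        <nfsclient name="bob-bar" target="bob.example.com" path="/mnt/bar" options="rw,no_root_squash,fsid=12345"/>
        <nfsclient name="jim-bar" target="jim.example.com" path="/mnt/bar" options="rw,no_root_squash,fsid=12345"/>
        <ip address="10.2.13.20"/>
    </service>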
In Example C.5, "NFS Service Set Up for Resource Reuse and Inheritance", however, the NFS client resources nfsclient:bob and nfsclient:jim are defined once; likewise, the NFS export resource nfsexport:exports is defined once. All the attributes needed by the resources are inherited from parent resources. Because the inherited attributes are dynamic (and do not conflict with one another), it is possible to reuse those resources, which is why they are defined in the resources block. It may not be practical to configure some resources in multiple places. For example, configuring a file system resource in multiple places can result in mounting one file system on two nodes, therefore causing problems.

C.4. Failure Recovery and Independent Subtrees

In most enterprise environments, the normal course of action for failure recovery of a service is to restart the entire service if any component in the service fails. For example, in Example C.6, "Service foo Normal Failure Recovery", if any of the scripts defined in this service fail, the normal course of action is to restart (or relocate or disable, according to the service recovery policy) the service. However, in some circumstances certain parts of a service may be considered non-critical; it may be necessary to restart only part of the service in place before attempting normal recovery. To accomplish that, you can use the __independent_subtree attribute. For example, in Example C.7, "Service foo Failure Recovery with __independent_subtree Attribute", the __independent_subtree attribute is used to accomplish the following actions:

- If script:script_one fails, restart script:script_one, script:script_two, and script:script_three.
- If script:script_two fails, restart just script:script_two.
- If script:script_three fails, restart script:script_one, script:script_two, and script:script_three.
- If script:script_four fails, restart the whole service.
Example C.6. Service foo Normal Failure Recovery

    <service name="foo">
        <script name="script_one" ...>
            <script name="script_two" .../>
        </script>
        <script name="script_three" .../>
    </service>

Example C.7. Service foo Failure Recovery with __independent_subtree Attribute

    <service name="foo">
        <script name="script_one" __independent_subtree="1" ...>
            <script name="script_two" __independent_subtree="1" .../>
            <script name="script_three" .../>
        </script>
        <script name="script_four" .../>
    </service>

In some circumstances, if a component of a service fails you may want to disable only that component without disabling the entire service, to avoid affecting other services that use other components of that service. As of the Red Hat Enterprise Linux 6.1 release, you can accomplish that by using the __independent_subtree="2" attribute, which designates the independent subtree as non-critical. You may only use the non-critical flag on singly-referenced resources. The non-critical flag works with all resources at all levels of the resource tree, but should not be used at the top level when defining services or virtual machines.

As of the Red Hat Enterprise Linux 6.1 release, you can set maximum restart and restart expirations on a per-node basis in the resource tree for independent subtrees. To set these thresholds, you can use the following attributes (a configuration sketch follows this list):

__max_restarts configures the maximum number of tolerated restarts prior to giving up.
__restart_expire_time configures the amount of time, in seconds, after which a restart is no longer attempted.
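For example, a minimal sketch that marks one subtree as non-critical and gives up restarting it after three failures within five minutes might look like the following; the script names and threshold values here are assumptions for illustration:

    <service name="foo">
        <script name="script_one" ...>
            <!-- Non-critical subtree: if it keeps failing it is stopped rather
                 than failing the whole service. Up to 3 restarts are tolerated;
                 the restart counter expires after 300 seconds. -->
            <script name="script_two" __independent_subtree="2"
                    __max_restarts="3" __restart_expire_time="300" .../>
        </script>
        <script name="script_three" .../>
    </service>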
C.5. Debugging and Testing Services and Resource Ordering

You can debug and test services and resource ordering with the rg_test utility. rg_test is a command-line utility provided by the rgmanager package that is run from a shell or a terminal (it is not available in Conga). Table C.2, "rg_test Utility Summary" summarizes the actions and syntax for the rg_test utility.

Table C.2. rg_test Utility Summary

| Action | Syntax |
|---|---|
| Display the resource rules that rg_test understands. | rg_test rules |
| Test a configuration (and /usr/share/cluster) for errors or redundant resource agents. | rg_test test /etc/cluster/cluster.conf |
| Display the start and stop ordering of a service. | Display start order: rg_test noop /etc/cluster/cluster.conf start service servicename. Display stop order: rg_test noop /etc/cluster/cluster.conf stop service servicename |
| Explicitly start or stop a service. | Only do this on one node, and always disable the service in rgmanager first. Start a service: rg_test test /etc/cluster/cluster.conf start service servicename. Stop a service: rg_test test /etc/cluster/cluster.conf stop service servicename |
| Calculate and display the resource tree delta between two cluster.conf files. | rg_test delta cluster.conf file 1 cluster.conf file 2. For example: rg_test delta /etc/cluster/cluster.conf.bak /etc/cluster/cluster.conf |
Cluster Service Resource Check and Failover Timeout

This appendix describes how rgmanager monitors the status of cluster resources, and how to modify the status check interval. The appendix also describes the __enforce_timeouts service parameter, which indicates that a timeout for an operation should cause a service to fail.

To fully comprehend the information in this appendix, you may require a detailed understanding of resource agents and the cluster configuration file, /etc/cluster/cluster.conf. For a comprehensive list and description of cluster.conf elements and attributes, refer to the cluster schema at /usr/share/cluster/cluster.rng, and the annotated schema at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example, /usr/share/doc/cman-3.0.12/cluster_conf.html).

D.1. Modifying the Resource Status Check Interval

rgmanager checks the status of individual resources, not whole services. Every 10 seconds, rgmanager scans the resource tree, looking for resources that have passed their "status check" interval.
Each resource agent specifies the amount of time between periodic status checks. Each resource utilizes these timeout values unless explicitly overridden in the cluster.conf file using the special <action> tag:

    <action name="status" depth="*" interval="10" />
This tag is a special child of the resource itself in the cluster.conf file. For example, if you had a file system resource for which you wanted to override the status check interval, you could specify the file system resource in the cluster.conf file as follows:

    <fs name="test" device="/dev/sdb3">
        <action name="status" depth="*" interval="10" />
        <nfsexport...>
        </nfsexport>
    </fs>

Some agents provide multiple "depths" of checking. For example, a normal file system status check (depth 0) checks whether the file system is mounted in the correct place. A more intensive check is depth 10, which checks whether you can read a file from the file system. A status check of depth 20 checks whether you can write to the file system. In the example given here, the depth is set to *, which indicates that these values should be used for all depths. The result is that the test file system is checked at the highest-defined depth provided by the resource agent (in this case, 20) every 10 seconds.

D.2. Enforcing Resource Timeouts

There is no timeout for starting, stopping, or failing over resources. Some resources take an indeterminately long amount of time to start or stop. Unfortunately, a failure to stop (including a timeout) renders the service inoperable (failed state). You can, if desired, turn on timeout enforcement on each resource in a service individually by adding __enforce_timeouts="1" to the reference in the cluster.conf file.

The following example shows a cluster service that has been configured with the __enforce_timeouts attribute set for the netfs resource. With this attribute set, if it takes more than 30 seconds to unmount the NFS file system during a recovery process, the operation will time out, causing the service to enter the failed state.

    <rm>
        <failoverdomains/>
        <resources>
            <netfs export="/nfstest" force_unmount="1" fstype="nfs" host="10.65.48.65"
                   mountpoint="/data/nfstest" name="nfstest_data" options="rw,sync,soft"/>
        </resources>
        <service autostart="1" exclusive="0" name="nfs_client_test" recovery="relocate">
            <netfs ref="nfstest_data" __enforce_timeouts="1"/>
        </service>
    </rm>

Command Line Tools Summary

Table E.1, "Command Line Tool Summary" summarizes preferred command-line tools for configuring and managing the High Availability Add-On. For more information about commands and variables, refer to the man page for each command-line tool.

Table E.1. Command Line Tool Summary

| Command Line Tool | Used With | Purpose |
|---|---|---|
| ccs_config_dump - Cluster Configuration Dump Tool | Cluster Infrastructure | ccs_config_dump generates XML output of the running configuration. The running configuration is sometimes different from the stored configuration on file because some subsystems store or set some default information into the configuration. Those values are generally not present on the on-disk version of the configuration but are required at runtime for the cluster to work properly. For more information about this tool, refer to the ccs_config_dump(8) man page. |
| ccs_config_validate - Cluster Configuration Validation Tool | Cluster Infrastructure | ccs_config_validate validates cluster.conf against the schema, cluster.rng (located in /usr/share/cluster/cluster.rng on each node). For more information about this tool, refer to the ccs_config_validate(8) man page. |
| clustat - Cluster Status Utility | High-availability Service Management Components | The clustat command displays the status of the cluster. It shows membership information, quorum view, and the state of all configured user services. For more information about this tool, refer to the clustat(8) man page. |
| clusvcadm - Cluster User Service Administration Utility | High-availability Service Management Components | The clusvcadm command allows you to enable, disable, relocate, and restart high-availability services in a cluster. For more information about this tool, refer to the clusvcadm(8) man page. |
| cman_tool - Cluster Management Tool | Cluster Infrastructure | cman_tool is a program that manages the CMAN cluster manager. It provides the capability to join a cluster, leave a cluster, kill a node, or change the expected quorum votes of a node in a cluster. For more information about this tool, refer to the cman_tool(8) man page. |
| fence_tool - Fence Tool | Cluster Infrastructure | fence_tool is a program used to join and leave the fence domain. For more information about this tool, refer to the fence_tool(8) man page. |
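As an illustration of how the service management tools fit together, the following sequence checks cluster status and relocates a service to another node; the service name example_service and the node name node2 are assumptions used only for illustration:

    # clustat                                # show membership, quorum, and service state
    # clusvcadm -r example_service -m node2  # relocate example_service to the node named node2
    # clustat                                # confirm where the service is now running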
High Availability LVM (HA-LVM)

The Red Hat High Availability Add-On provides support for high availability LVM volumes (HA-LVM) in a failover configuration. This is distinct from active/active configurations enabled by the Clustered Logical Volume Manager (CLVM), which is a set of clustering extensions to LVM that allow a cluster of computers to manage shared storage.

When to use CLVM or HA-LVM should be based on the needs of the applications or services being deployed.

- If the applications are cluster-aware and have been tuned to run simultaneously on multiple machines at a time, then CLVM should be used. Specifically, if more than one node of your cluster will require access to your storage, which is then shared among the active nodes, then you must use CLVM. CLVM allows a user to configure logical volumes on shared storage by locking access to physical storage while a logical volume is being configured, and uses clustered locking services to manage the shared storage. For information on CLVM, and on LVM configuration in general, refer to Logical Volume Manager Administration.
- If the applications run optimally in active/passive (failover) configurations, in which only a single node that accesses the storage is active at any one time, you should use High Availability Logical Volume Management agents (HA-LVM).
Most applications will run better in an active/passive configuration, as they are not designed or optimized to run concurrently with other instances. Choosing to run an application that is not cluster-aware on clustered logical volumes may result in degraded performance if the logical volume is mirrored. This is because there is cluster communication overhead for the logical volumes themselves in these instances. A cluster-aware application must be able to achieve performance gains above the performance losses introduced by cluster file systems and cluster-aware logical volumes. This is achievable for some applications and workloads more easily than others. Determining what the requirements of the cluster are and whether the extra effort toward optimizing for an active/active cluster will pay dividends is the way to choose between the two LVM variants. Most users will achieve the best HA results from using HA-LVM.

HA-LVM and CLVM are similar in the fact that they prevent corruption of LVM metadata and its logical volumes, which could otherwise occur if multiple machines were allowed to make overlapping changes. HA-LVM imposes the restriction that a logical volume can only be activated exclusively; that is, active on only one machine at a time. This means that only local (non-clustered) implementations of the storage drivers are used. Avoiding the cluster coordination overhead in this way increases performance. CLVM does not impose these restrictions; a user is free to activate a logical volume on all machines in a cluster, which forces the use of cluster-aware storage drivers, which allow for cluster-aware file systems and applications to be put on top.

HA-LVM can be set up to use one of two methods for achieving its mandate of exclusive logical volume activation.

- The preferred method uses CLVM, but it will only ever activate the logical volumes exclusively. This has the advantage of easier setup and better prevention of administrative mistakes (like removing a logical volume that is in use). In order to use CLVM, the High Availability Add-On and Resilient Storage Add-On software, including the clvmd daemon, must be running.
- The second method uses local machine locking and LVM "tags". This method has the advantage of not requiring any LVM cluster packages; however, there are more steps involved in setting it up and it does not prevent an administrator from mistakenly removing a logical volume from a node in the cluster where it is not active. The procedure for configuring HA-LVM using this method is described in Section F.2, "Configuring HA-LVM Failover with Tagging".
F.1. Configuring HA-LVM Failover with CLVM (preferred)

To set up HA-LVM failover (using the preferred CLVM variant), perform the following steps:

1. Ensure that your system is configured to support CLVM, which requires the following:

   - The High Availability Add-On and Resilient Storage Add-On are installed, including the cmirror package if the CLVM logical volumes are to be mirrored.
   - The locking_type parameter in the global section of the /etc/lvm/lvm.conf file is set to the value '3'. (A sketch of this setting follows the list.)
   - The High Availability Add-On and Resilient Storage Add-On software, including the clvmd daemon, must be running. For CLVM mirroring, the cmirrord service must be started as well.
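For reference, a minimal sketch of the relevant portion of /etc/lvm/lvm.conf (with all other settings omitted) might look like this:

    global {
        # Clustered locking, required for CLVM
        locking_type = 3
    }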
2. Create the logical volume and file system using standard LVM and file system commands, as in the following example.

       # pvcreate /dev/sd[cde]1
       # vgcreate -cy shared_vg /dev/sd[cde]1
       # lvcreate -L 10G -n ha_lv shared_vg
       # mkfs.ext4 /dev/shared_vg/ha_lv
       # lvchange -an shared_vg/ha_lv

   For information on creating LVM logical volumes, refer to Logical Volume Manager Administration.

3. Edit the /etc/cluster/cluster.conf file to include the newly created logical volume as a resource in one of your services. Alternately, you can use Conga or the ccs command to configure LVM and file system resources for the cluster. The following is a sample resource manager section from the /etc/cluster/cluster.conf file that configures a CLVM logical volume as a cluster resource:

       <rm>
           <failoverdomains>
               <failoverdomain name="FD" ordered="1" restricted="0">
                   <failoverdomainnode name="neo-01" priority="1"/>
                   <failoverdomainnode name="neo-02" priority="2"/>
               </failoverdomain>
           </failoverdomains>
           <resources>
               <lvm name="lvm" vg_name="shared_vg" lv_name="ha_lv"/>
               <fs name="FS" device="/dev/shared_vg/ha_lv" force_fsck="0" force_unmount="1" fsid="64050" fstype="ext4" mountpoint="/mnt" options="" self_fence="0"/>
           </resources>
           <service autostart="1" domain="FD" name="serv" recovery="relocate">
               <lvm ref="lvm"/>
               <fs ref="FS"/>
           </service>
       </rm>
F.2. Configuring HA-LVM Failover with Tagging

To set up HA-LVM failover by using tags in the /etc/lvm/lvm.conf file, perform the following steps:

1. Ensure that the locking_type parameter in the global section of the /etc/lvm/lvm.conf file is set to the value '1'.

2. Create the logical volume and file system using standard LVM and file system commands, as in the following example.

       # pvcreate /dev/sd[cde]1
       # vgcreate shared_vg /dev/sd[cde]1
       # lvcreate -L 10G -n ha_lv shared_vg
       # mkfs.ext4 /dev/shared_vg/ha_lv

   For information on creating LVM logical volumes, refer to Logical Volume Manager Administration.

3. Edit the /etc/cluster/cluster.conf file to include the newly created logical volume as a resource in one of your services. Alternately, you can use Conga or the ccs command to configure LVM and file system resources for the cluster. The following is a sample resource manager section from the /etc/cluster/cluster.conf file that configures an LVM logical volume as a cluster resource:

       <rm>
           <failoverdomains>
               <failoverdomain name="FD" ordered="1" restricted="0">
                   <failoverdomainnode name="neo-01" priority="1"/>
                   <failoverdomainnode name="neo-02" priority="2"/>
               </failoverdomain>
           </failoverdomains>
           <resources>
               <lvm name="lvm" vg_name="shared_vg" lv_name="ha_lv"/>
               <fs name="FS" device="/dev/shared_vg/ha_lv" force_fsck="0" force_unmount="1" fsid="64050" fstype="ext4" mountpoint="/mnt" options="" self_fence="0"/>
           </resources>
           <service autostart="1" domain="FD" name="serv" recovery="relocate">
               <lvm ref="lvm"/>
               <fs ref="FS"/>
           </service>
       </rm>

   If there are multiple logical volumes in the volume group, then the logical volume name (lv_name) in the lvm resource should be left blank or unspecified. Also note that in an HA-LVM configuration, a volume group may be used by only a single service.

4. Edit the volume_list field in the /etc/lvm/lvm.conf file. Include the name of your root volume group and your hostname as listed in the /etc/cluster/cluster.conf file, preceded by @. The hostname to include here is the machine on which you are editing the lvm.conf file, not any remote hostname. Note that this string MUST match the node name given in the cluster.conf file. Below is a sample entry from the /etc/lvm/lvm.conf file:

       volume_list = [ "VolGroup00", "@neo-01" ]

   The cluster uses this hostname tag to activate shared volume groups or logical volumes. DO NOT include in the volume_list field the names of any volume groups that are to be shared using HA-LVM.

5. Update the initrd image on all your cluster nodes:

       # dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)

6. Reboot all nodes to ensure the correct initrd image is in use.
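After a failover in a tag-based configuration, one way to inspect the tags on the shared volume group (for example, when verifying which node has activated it) is with the vgs command. This is only a sketch, reusing the volume group name from the example above; the exact tag that appears depends on how the lvm resource agent is configured:

    # vgs -o vg_name,vg_tags shared_vg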
Revision History

- Revision 5.0-25, Mon Feb 18 2013, Steven Levine. Version for 6.4 GA release.
- Revision 5.0-23, Wed Jan 30 2013, Steven Levine. Resolves: 901641. Corrects and clarifies iptables rules.
- Revision 5.0-22, Tue Jan 29 2013, Steven Levine. Resolves: 788636. Documents RRP configuration through the ccs command. Resolves: 789010. Documents RRP configuration in the cluster.conf file.
- Revision 5.0-20, Fri Jan 18 2013, Steven Levine. Resolves: 894097. Removes advice to ensure you are not using VLAN tagging. Resolves: 845365. Indicates that bonding modes 0 and 2 are now supported.
- Revision 5.0-19, Thu Jan 17 2013, Steven Levine. Resolves: 896234. Clarifies terminology of cluster node references.
- Revision 5.0-16, Mon Nov 26 2012, Steven Levine. Version for 6.4 Beta release.
- Revision 5.0-15, Wed Nov 20 2012, Steven Levine. Resolves: 838988. Documents nfsrestart attribute for file system resource agents. Resolves: 843169. Documents IBM iPDU fence agent. Resolves: 846121. Documents Eaton Network Power Controller (SNMP Interface) fence agent. Resolves: 856834. Documents HP Bladesystem fence agent. Resolves: 865313. Documents NFS Server resource agent. Resolves: 862281. Clarifies which ccs commands overwrite previous settings. Resolves: 846205. Documents iptables firewall filtering for igmp component. Resolves: 857172. Documents ability to remove users from luci. Resolves: 857165. Documents the privilege level parameter of the IPMI fence agent. Resolves: 840912. Clears up formatting issue with resource parameter table. Resolves: 849240, 870292. Clarifies installation procedure. Resolves: 871165. Clarifies description of IP address parameter in description of IP address resource agent. Resolves: 845333, 869039, 856681. Fixes small typographical errors and clarifies small technical ambiguities.
- Revision 5.0-12, Thu Nov 1 2012, Steven Levine. Added newly-supported fence agents.
- Revision 5.0-7, Thu Oct 25 2012, Steven Levine. Added section on override semantics.
- Revision 5.0-6, Tue Oct 23 2012, Steven Levine. Fixed default value of Post Join Delay.
- Revision 5.0-4, Tue Oct 16 2012, Steven Levine. Added description of NFS server resource.
- Revision 5.0-2, Thu Oct 11 2012, Steven Levine. Updates to Conga descriptions.
- Revision 5.0-1, Mon Oct 8 2012, Steven Levine.
- Revision 4.0-5, Fri Jun 15 2012, Steven Levine. Version for 6.3 GA release.
- Revision 4.0-4, Tue Jun 12 2012, Steven Levine. Resolves: 830148. Ensures consistency of port number examples for luci.
- Revision 4.0-3, Tue May 21 2012, Steven Levine. Resolves: 696897. Adds cluster.conf parameter information to tables of fence device parameters and resource parameters. Resolves: 811643. Adds procedure for restoring a luci database on a separate machine.
- Revision 4.0-2, Wed Apr 25 2012, Steven Levine. Resolves: 815619. Removes warning about using UDP Unicast with GFS2 file systems.
- Revision 4.0-1, Fri Mar 30 2012, Steven Levine. Resolves: 771447, 800069, 800061. Updates documentation of luci to be consistent with Red Hat Enterprise Linux 6.3 version. Resolves: 712393. Adds information on capturing an application core for RGManager. Resolves: 800074. Documents condor resource agent. Resolves: 757904. Documents luci configuration backup and restore. Resolves: 772374. Adds section on managing virtual machines in a cluster. Resolves: 712378. Adds documentation for HA-LVM configuration. Resolves: 712400. Documents debug options. Resolves: 751156. Documents new fence_ipmilan parameter. Resolves: 721373. Documents which configuration changes require a cluster restart.
- Revision 3.0-5, Thu Dec 1 2011, Steven Levine. Release for GA of Red Hat Enterprise Linux 6.2. Resolves: 755849. Corrects monitor_link parameter example.
- Revision 3.0-4, Mon Nov 7 2011, Steven Levine. Resolves: 749857. Adds documentation for RHEV-M REST API fence device.
- Revision 3.0-3, Fri Oct 21 2011, Steven Levine. Resolves: #747181, #747182, #747184, #747185, #747186, #747187, #747188, #747189, #747190, #747192. Corrects typographical errors and ambiguities found during documentation QE review for Red Hat Enterprise Linux 6.2.
- Revision 3.0-2, Fri Oct 7 2011, Steven Levine. Resolves: #743757. Corrects reference to supported bonding mode in troubleshooting section.
- Revision 3.0-1, Wed Sep 28 2011, Steven Levine. Initial revision for Red Hat Enterprise Linux 6.2 Beta release. Resolves: #739613. Documents support for new ccs options to display available fence devices and available services. Resolves: #707740. Documents updates to the Conga interface and documents support for setting user permissions to administer Conga. Resolves: #731856. Documents support for configuring luci by means of the /etc/sysconfig/luci file. Resolves: #736134. Documents support for UDPU transport. Resolves: #736143. Documents support for clustered Samba. Resolves: #617634. Documents how to configure the only IP address luci is served at. Resolves: #713259. Documents support for fence_vmware_soap agent. Resolves: #721009. Provides link to Support Essentials article. Resolves: #717006. Provides information on allowing multicast traffic through the iptables firewall. Resolves: #717008. Provides information about cluster service status check and failover timeout. Resolves: #711868. Clarifies description of autostart. Resolves: #728337. Documents procedure for adding vm resources with the ccs command. Resolves: #725315, #733011, #733074, #733689. Corrects small typographical errors.
- Revision 2.0-1, Thu May 19 2011, Steven Levine. Initial revision for Red Hat Enterprise Linux 6.1. Resolves: #671250. Documents support for SNMP traps. Resolves: #659753. Documents ccs command. Resolves: #665055. Updates Conga documentation to reflect updated display and feature support. Resolves: #680294. Documents need for password access for ricci agent. Resolves: #687871. Adds chapter on troubleshooting. Resolves: #673217. Fixes typographical error. Resolves: #675805. Adds reference to cluster.conf schema to tables of HA resource parameters. Resolves: #672697. Updates tables of fence device parameters to include all currently supported fencing devices. Resolves: #677994. Corrects information for fence_ilo fence agent parameters. Resolves: #629471. Adds technical note about setting consensus value in a two-node cluster. Resolves: #579585. Updates section on upgrading Red Hat High Availability Add-On Software. Resolves: #643216. Clarifies small issues throughout document. Resolves: #643191. Provides improvements and corrections for the luci documentation. Resolves: #704539. Updates the table of Virtual Machine resource parameters.
- Revision 1.0-1, Wed Nov 10 2010, Paul Kennedy. Initial release for Red Hat Enterprise Linux 6.
Index

C

- CISCO MDS fence device, Fence Device Parameters
- Cisco UCS fence device , Fence Device Parameters
- cluster
- administration, Before Configuring the Red Hat High Availability Add-On, Managing Red Hat High Availability Add-On With Conga, Managing Red Hat High Availability Add-On With ccs, Managing Red Hat High Availability Add-On With Command Line Tools
- diagnosing and correcting problems, Diagnosing and Correcting Problems in a Cluster, Diagnosing and Correcting Problems in a Cluster
- starting, stopping, restarting, Starting and Stopping the Cluster Software
- cluster administration, Before Configuring the Red Hat High Availability Add-On, Managing Red Hat High Availability Add-On With Conga, Managing Red Hat High Availability Add-On With ccs, Managing Red Hat High Availability Add-On With Command Line Tools
- adding cluster node, Adding a Member to a Running Cluster, Adding a Member to a Running Cluster
- compatible hardware, Compatible Hardware
- configuration validation, Configuration Validation
- configuring ACPI, Configuring ACPI For Use with Integrated Fence Devices
- configuring iptables, Enabling IP Ports
- considerations for using qdisk, Considerations for Using Quorum Disk
- considerations for using quorum disk, Considerations for Using Quorum Disk
- deleting a cluster, Starting, Stopping, Restarting, and Deleting Clusters
- deleting a node from the configuration; adding a node to the configuration , Deleting or Adding a Node
- diagnosing and correcting problems in a cluster, Diagnosing and Correcting Problems in a Cluster, Diagnosing and Correcting Problems in a Cluster
- displaying HA services with clustat, Displaying HA Service Status with clustat
- enabling IP ports, Enabling IP Ports
- general considerations, General Configuration Considerations
- joining a cluster, Causing a Node to Leave or Join a Cluster, Causing a Node to Leave or Join a Cluster
- leaving a cluster, Causing a Node to Leave or Join a Cluster, Causing a Node to Leave or Join a Cluster
- managing cluster node, Managing Cluster Nodes, Managing Cluster Nodes
- managing high-availability services, Managing High-Availability Services, Managing High-Availability Services
- managing high-availability services, freeze and unfreeze, Managing HA Services with clusvcadm, Considerations for Using the Freeze and Unfreeze Operations
- network switches and multicast addresses, Multicast Addresses
- NetworkManager, Considerations for NetworkManager
- rebooting cluster node, Rebooting a Cluster Node
- removing cluster node, Deleting a Member from a Cluster
- restarting a cluster, Starting, Stopping, Restarting, and Deleting Clusters
- ricci considerations, Considerations for ricci
- SELinux, Red Hat High Availability Add-On and SELinux
- starting a cluster, Starting, Stopping, Restarting, and Deleting Clusters, Starting and Stopping a Cluster
- starting, stopping, restarting a cluster, Starting and Stopping the Cluster Software
- stopping a cluster, Starting, Stopping, Restarting, and Deleting Clusters, Starting and Stopping a Cluster
- updating a cluster configuration using cman_tool version -r, Updating a Configuration Using cman_tool version -r
- updating a cluster configuration using scp, Updating a Configuration Using scp
- updating configuration, Updating a Configuration
- virtual machines, Configuring Virtual Machines in a Clustered Environment
- cluster configuration, Configuring Red Hat High Availability Add-On With Conga, Configuring Red Hat High Availability Add-On With the ccs Command, Configuring Red Hat High Availability Add-On With Command Line Tools
- deleting or adding a node, Deleting or Adding a Node
- updating, Updating a Configuration
- cluster resource relationships, Parent, Child, and Sibling Relationships Among Resources
- cluster resource status check, Cluster Service Resource Check and Failover Timeout
- cluster resource types, Considerations for Configuring HA Services
- cluster service managers
- configuration, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster
- cluster services, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster, Adding a Cluster Service to the Cluster
- (see also adding to the cluster configuration)
- cluster software
- configuration, Configuring Red Hat High Availability Add-On With Conga, Configuring Red Hat High Availability Add-On With the ccs Command, Configuring Red Hat High Availability Add-On With Command Line Tools
- configuration
- HA service, Considerations for Configuring HA Services
- Configuring High Availability LVM, High Availability LVM (HA-LVM)
- Conga
- accessing, Configuring Red Hat High Availability Add-On Software
- consensus value, The consensus Value for totem in a Two-Node Cluster
F

- failover timeout, Cluster Service Resource Check and Failover Timeout
- features, new and changed, New and Changed Features
- feedback, Feedback
- fence agent
- fence_apc, Fence Device Parameters
- fence_apc_snmp, Fence Device Parameters
- fence_bladecenter, Fence Device Parameters
- fence_brocade, Fence Device Parameters
- fence_cisco_mds, Fence Device Parameters
- fence_cisco_ucs, Fence Device Parameters
- fence_drac5, Fence Device Parameters
- fence_eaton_snmp, Fence Device Parameters
- fence_egenera, Fence Device Parameters
- fence_eps, Fence Device Parameters
- fence_hpblade, Fence Device Parameters
- fence_ibmblade, Fence Device Parameters
- fence_ifmib, Fence Device Parameters
- fence_ilo, Fence Device Parameters
- fence_ilo_mp, Fence Device Parameters
- fence_intelmodular, Fence Device Parameters
- fence_ipdu, Fence Device Parameters
- fence_ipmilan, Fence Device Parameters
- fence_rhevm, Fence Device Parameters
- fence_rsb, Fence Device Parameters
- fence_scsi, Fence Device Parameters
- fence_virt, Fence Device Parameters
- fence_vmware_soap, Fence Device Parameters
- fence_wti, Fence Device Parameters
- fence device
- APC power switch over SNMP, Fence Device Parameters
- APC power switch over telnet/SSH, Fence Device Parameters
- Brocade fabric switch, Fence Device Parameters
- Cisco MDS, Fence Device Parameters
- Cisco UCS, Fence Device Parameters
- Dell DRAC 5, Fence Device Parameters
- Eaton network power switch, Fence Device Parameters
- Egenera SAN controller, Fence Device Parameters
- ePowerSwitch, Fence Device Parameters
- Fence virt, Fence Device Parameters
- Fujitsu Siemens Remoteview Service Board (RSB), Fence Device Parameters
- HP BladeSystem, Fence Device Parameters
- HP iLO MP, Fence Device Parameters
- HP iLO/iLO2, Fence Device Parameters
- IBM BladeCenter, Fence Device Parameters
- IBM BladeCenter SNMP, Fence Device Parameters
- IBM iPDU, Fence Device Parameters
- IF MIB, Fence Device Parameters
- Intel Modular, Fence Device Parameters
- IPMI LAN, Fence Device Parameters
- RHEV-M REST API, Fence Device Parameters
- SCSI fencing, Fence Device Parameters
- VMware (SOAP interface), Fence Device Parameters
- WTI power switch, Fence Device Parameters
- Fence virt fence device , Fence Device Parameters
- fence_apc fence agent, Fence Device Parameters
- fence_apc_snmp fence agent, Fence Device Parameters
- fence_bladecenter fence agent, Fence Device Parameters
- fence_brocade fence agent, Fence Device Parameters
- fence_cisco_mds fence agent, Fence Device Parameters
- fence_cisco_ucs fence agent, Fence Device Parameters
- fence_drac5 fence agent, Fence Device Parameters
- fence_eaton_snmp fence agent, Fence Device Parameters
- fence_egenera fence agent, Fence Device Parameters
- fence_eps fence agent, Fence Device Parameters
- fence_hpblade fence agent, Fence Device Parameters
- fence_ibmblade fence agent, Fence Device Parameters
- fence_ifmib fence agent, Fence Device Parameters
- fence_ilo fence agent, Fence Device Parameters
- fence_ilo_mp fence agent, Fence Device Parameters
- fence_intelmodular fence agent, Fence Device Parameters
- fence_ipdu fence agent, Fence Device Parameters
- fence_ipmilan fence agent, Fence Device Parameters
- fence_rhevm fence agent, Fence Device Parameters
- fence_rsb fence agent, Fence Device Parameters
- fence_scsi fence agent, Fence Device Parameters
- fence_virt fence agent, Fence Device Parameters
- fence_vmware_soap fence agent, Fence Device Parameters
- fence_wti fence agent, Fence Device Parameters
- Fujitsu Siemens Remoteview Service Board (RSB) fence device, Fence Device Parameters
I

- IBM BladeCenter fence device, Fence Device Parameters
- IBM BladeCenter SNMP fence device , Fence Device Parameters
- IBM iPDU fence device , Fence Device Parameters
- IF MIB fence device , Fence Device Parameters
- integrated fence devices
- configuring ACPI, Configuring ACPI For Use with Integrated Fence Devices
- Intel Modular fence device , Fence Device Parameters
- introduction, Introduction
- other Red Hat Enterprise Linux documents, Introduction
- IP ports
- enabling, Enabling IP Ports
- IPMI LAN fence device , Fence Device Parameters
- iptables
- configuring, Enabling IP Ports
- iptables firewall, Configuring the iptables Firewall to Allow Cluster Components
T

- tables
- fence devices, parameters, Fence Device Parameters
- HA resources, parameters, HA Resource Parameters
- timeout failover, Cluster Service Resource Check and Failover Timeout
- tools, command line, Command Line Tools Summary
- totem tag
- consensus value, The consensus Value for totem in a Two-Node Cluster
- troubleshooting
- diagnosing and correcting problems in a cluster, Diagnosing and Correcting Problems in a Cluster, Diagnosing and Correcting Problems in a Cluster
- types
- cluster resource, Considerations for Configuring HA Services