Cluster Administration

Chapter 7. Configuring Red Hat High Availability Add-On With Command Line Tools

This chapter describes how to configure Red Hat High Availability Add-On software by directly editing the cluster configuration file (/etc/cluster/cluster.conf) and using command-line tools. It provides procedures for building a configuration file one section at a time, starting with a sample file provided in the chapter. As an alternative, you could copy a skeleton configuration file from the cluster.conf man page; however, doing so would not necessarily align with information provided in subsequent procedures in this chapter. There are other ways to create and configure a cluster configuration file; the procedures here represent one approach. Also, keep in mind that the resulting file is just a starting point for developing a configuration file to suit your clustering needs.
This chapter consists of the following sections:
  • Section 7.1, "Configuration Tasks"
  • Section 7.2, "Creating a Basic Cluster Configuration File"
  • Section 7.3, "Configuring Fencing"
  • Section 7.4, "Configuring Failover Domains"
  • Section 7.5, "Configuring HA Services"
  • Section 7.6, "Configuring Redundant Ring Protocol"
  • Section 7.7, "Configuring Debug Options"
  • Section 7.8, "Verifying a Configuration"

Important

Make sure that your deployment of High Availability Add-On meets your needs and can be supported. Consult with an authorized Red Hat representative to verify your configuration prior to deployment. In addition, allow time for a configuration burn-in period to test failure modes.

Important

This chapter references commonly used cluster.conf elements and attributes. For a comprehensive list and description of cluster.conf elements and attributes, refer to the cluster schema at /usr/share/cluster/cluster.rng, and the annotated schema at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example /usr/share/doc/cman-3.0.12/cluster_conf.html).

Important

Certain procedures in this chapter call for using the cman_tool version -r command to propagate a cluster configuration throughout a cluster. Using that command requires that ricci is running. Using ricci requires a password the first time you interact with ricci from any specific machine. For information on the ricci service, refer to Section 2.13, "Considerations for ricci".
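For example, the following commands, run on each cluster node, are one common way to start ricci, enable it at boot, and set a ricci password; refer to Section 2.13, "Considerations for ricci" for the authoritative procedure for your environment:
    service ricci start
    chkconfig ricci on
    passwd ricci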

Note

Procedures in this chapter may include specific commands for some of the command-line tools listed in Appendix E, Command Line Tools Summary. For more information about all commands and variables, refer to the man page for each command-line tool.

7.1. Configuration Tasks

Configuring Red Hat High Availability Add-On software with command-line tools consists of the following steps:
  1. Creating a basic cluster configuration file. Refer to Section 7.2, "Creating a Basic Cluster Configuration File".
  2. Configuring fencing. Refer to Section 7.3, "Configuring Fencing".
  3. Configuring failover domains. Refer to Section 7.4, "Configuring Failover Domains".
  4. Configuring HA services. Refer to Section 7.5, "Configuring HA Services".
  5. Verifying a configuration. Refer to Section 7.8, "Verifying a Configuration".

7.2. Creating a Basic Cluster Configuration File

Provided that cluster hardware, Red Hat Enterprise Linux, and High Availability Add-On software are installed, you can create a cluster configuration file (/etc/cluster/cluster.conf) and start running the High Availability Add-On. As a starting point only, this section describes how to create a skeleton cluster configuration file without fencing, failover domains, and HA services. Subsequent sections describe how to configure those parts of the configuration file.

Important

This is just an interim step to create a cluster configuration file; the resultant file does not have any fencing and is not considered to be a supported configuration.
The following steps describe how to create and configure a skeleton cluster configuration file. Ultimately, the configuration file for your cluster will vary according to the number of nodes, the type of fencing, the type and number of HA services, and other site-specific requirements.
  1. At any node in the cluster, create /etc/cluster/cluster.conf, using the template of the example in Example 7.1, "cluster.conf Sample: Basic Configuration".
  2. (Optional) If you are configuring a two-node cluster, you can add the following line to the configuration file to allow a single node to maintain quorum (for example, if one node fails):
    <cman two_node="1" expected_votes="1"/>
    When you add or remove the two_node option from the cluster.conf file, you must restart the cluster for this change to take effect when you update the configuration. For information on updating a cluster configuration, refer to Section 8.4, "Updating a Configuration". For an example of specifying the two_node option, refer to Example 7.2, "cluster.conf Sample: Basic Two-Node Configuration".
  3. Specify the cluster name and the configuration version number using the cluster attributes: name and config_version (refer to Example 7.1, "cluster.conf Sample: Basic Configuration" or Example 7.2, "cluster.conf Sample: Basic Two-Node Configuration").
  4. In the clusternodes section, specify the node name and the node ID of each node using the clusternode attributes: name and nodeid.
  5. Save /etc/cluster/cluster.conf.
  6. Validate the file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  7. Propagate the configuration file to /etc/cluster/ in each cluster node. For example, you could propagate the file to other cluster nodes using the scp command.

    Note

    Propagating the cluster configuration file this way is necessary the first time a cluster is created. Once a cluster is installed and running, the cluster configuration file can be propagated using the cman_tool version -r command. It is possible to use the scp command to propagate an updated configuration file; however, the cluster software must be stopped on all nodes while using the scp command. In addition, you should run the ccs_config_validate command if you propagate an updated configuration file via the scp command. (A brief example of propagating the file with scp follows this procedure.)

    Note

    While there are other elements and attributes present in the sample configuration file (for example, fence and fencedevices), there is no need to populate them now. Subsequent procedures in this chapter provide information about specifying other elements and attributes.
  8. Start the cluster. At each cluster node run the following command:
    service cman start
    For example:
    [root@example-01 ~]# service cman start
    Starting cluster:
       Checking Network Manager...                             [  OK  ]
       Global setup...                                         [  OK  ]
       Loading kernel modules...                               [  OK  ]
       Mounting configfs...                                    [  OK  ]
       Starting cman...                                        [  OK  ]
       Waiting for quorum...                                   [  OK  ]
       Starting fenced...                                      [  OK  ]
       Starting dlm_controld...                                [  OK  ]
       Starting gfs_controld...                                [  OK  ]
       Unfencing self...                                       [  OK  ]
       Joining fence domain...                                 [  OK  ]
  9. At any cluster node, run cman_tool nodes to verify that the nodes are functioning as members in the cluster (signified as "M" in the status column, "Sts"). For example:
    [root@example-01 ~]# cman_tool nodes
    Node  Sts   Inc   Joined               Name
       1   M    548   2010-09-28 10:52:21  node-01.example.com
       2   M    548   2010-09-28 10:52:21  node-02.example.com
       3   M    544   2010-09-28 10:52:21  node-03.example.com
  10. If the cluster is running, proceed to Section 7.3, "Configuring Fencing".
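As noted in step 7, the configuration file can be copied to the other nodes with the scp command the first time the cluster is created. A minimal sketch, assuming the node names used in the sample files in this chapter, is:
    [root@example-01 ~]# scp /etc/cluster/cluster.conf root@node-02.example.com:/etc/cluster/
    [root@example-01 ~]# scp /etc/cluster/cluster.conf root@node-03.example.com:/etc/cluster/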

Basic Configuration Examples

Example 7.1, "cluster.conf Sample: Basic Configuration" and Example 7.2, "cluster.conf Sample: Basic Two-Node Configuration" (for a two-node cluster) each provide a very basic sample cluster configuration file as a starting point. Subsequent procedures in this chapter provide information about configuring fencing and HA services.

Example 7.1. cluster.conf Sample: Basic Configuration

<cluster name="mycluster" config_version="2">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> </fence> </clusternode>   </clusternodes>   <fencedevices>   </fencedevices>   <rm>   </rm></cluster>

Example 7.2. cluster.conf Sample: Basic Two-Node Configuration

<cluster name="mycluster" config_version="2">   <cman two_node="1" expected_votes="1"/>   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> </fence> </clusternode>   </clusternodes>   <fencedevices>   </fencedevices>   <rm>   </rm></cluster>

The consensus Value for totem in a Two-Node Cluster

When you create a two-node cluster and you do not intend to add additional nodes to the cluster at a later time, then you should omit the consensus value in the totem tag in the cluster.conf file so that the consensus value is calculated automatically. When the consensus value is calculated automatically, the following rules are used:
  • If there are two nodes or fewer, the consensus value will be (token * 0.2), with a ceiling of 2000 msec and a floor of 200 msec.
  • If there are three or more nodes, the consensus value will be (token + 2000 msec).
If you let the cman utility configure your consensus timeout in this fashion, then moving at a later time from two to three (or more) nodes will require a cluster restart, since the consensus timeout will need to change to the larger value based on the token timeout.
If you are configuring a two-node cluster and intend to upgrade in the future to more than two nodes, you can override the consensus timeout so that a cluster restart is not required when moving from two to three (or more) nodes. This can be done in the cluster.conf as follows:
<totem token="X" consensus="X + 2000" />
Note that the configuration parser does not calculate X + 2000 automatically. An integer value must be used rather than an equation.
The advantage of using the optimized consensus timeout for two-node clusters is that overall failover time is reduced for the two-node case, since consensus is not a function of the token timeout.
Note that for two-node autodetection in cman, the number of physical nodes is what matters and not the presence of the two_node=1 directive in the cluster.conf file.
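For illustration only, if the token timeout were set to 10000 msec (an assumed value, not necessarily appropriate for your cluster), the pre-calculated consensus value of token + 2000 would be written as a literal integer:
<totem token="10000" consensus="12000"/>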

7.3. Configuring Fencing

Configuring fencing consists of (a) specifying one or more fence devices in a cluster and (b) specifying one or more fence methods for each node (using a fence device or fence devices specified).
Based on the type of fence devices and fence methods required for your configuration, configure cluster.conf as follows:
  1. In the fencedevices section, specify each fence device, using a fencedevice element and fence-device dependent attributes. Example 7.3, "APC Fence Device Added to cluster.conf " shows an example of a configuration file with an APC fence device added to it.
  2. At the clusternodes section, within the fence element of each clusternode section, specify each fence method of the node. Specify the fence method name, using the method attribute, name. Specify the fence device for each fence method, using the device element and its attributes, name and fence-device-specific parameters. Example 7.4, "Fence Methods Added to cluster.conf " shows an example of a fence method with one fence device for each node in the cluster.
  3. For non-power fence methods (that is, SAN/storage fencing), at the clusternodes section, add an unfence section. This ensures that a fenced node is not re-enabled until the node has been rebooted. For more information about unfencing a node, refer to the fence_node(8) man page.
    The unfence section does not contain method sections like the fence section does. It contains device references directly, which mirror the corresponding device sections for fence, with the notable addition of the explicit action (action) of "on" or "enable". The same fencedevice is referenced by both fence and unfence device lines, and the same per-node arguments should be repeated.
    Specifying the action attribute as "on" or "enable" enables the node when rebooted. Example 7.4, "Fence Methods Added to cluster.conf " and Example 7.5, "cluster.conf: Multiple Fence Methods per Node" include examples of the unfence elements and attributes.
    For more information about unfence refer to the fence_node man page.
  4. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  5. Save /etc/cluster/cluster.conf.
  6. (Optional) Validate the updated file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  7. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes. This will also run additional validation. It is necessary that ricci be running in each cluster node to be able to propagate updated cluster configuration information.
  8. Verify that the updated configuration file has been propagated.
If required, you can configure complex configurations with multiple fence methods per node and with multiple fence devices per fence method. When specifying multiple fence methods per node, if fencing fails using the first method, fenced, the fence daemon, tries the next method, and continues to cycle through methods until one succeeds.
Sometimes, fencing a node requires disabling two I/O paths or two power ports. This is done by specifying two or more devices within a fence method. fenced runs the fence agent once for each fence-device line; all must succeed for fencing to be considered successful.
More complex configurations are shown in the section called "Fencing Configuration Examples".
You can find more information about configuring specific fence devices from a fence-device agent man page (for example, the man page for fence_apc). In addition, you can get more information about fencing parameters from Appendix A, Fence Device Parameters, the fence agents in /usr/sbin/, the cluster schema at /usr/share/cluster/cluster.rng, and the annotated schema at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example, /usr/share/doc/cman-3.0.12/cluster_conf.html).
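Once the fencing configuration has been propagated, you may want to confirm that a node can actually be fenced before relying on the configuration in production. One way to do this is with the fence_node command, which fences the node you name using the fence devices configured in cluster.conf; be aware that this really does fence (typically power-cycle or isolate) the node, so run it only against a node you can afford to lose and consult the fence_node(8) man page first. For example:
    fence_node node-02.example.com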

Fencing Configuration Examples

The following examples show a simple configuration with one fence method per node and one fence device per fence method:
The following examples show more complex configurations:

Note

The examples in this section are not exhaustive; that is, there may be other ways to configure fencing depending on your requirements.

Example 7.3. APC Fence Device Added to cluster.conf

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> </fence> </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>   </fencedevices>   <rm>   </rm></cluster>
In this example, a fence device (fencedevice) has been added to the fencedevices element, specifying the fence agent (agent) as fence_apc, the IP address (ipaddr) as apc_ip_example, the login (login) as login_example, the name of the fence device (name) as apc, and the password (passwd) as password_example.

Example 7.4. Fence Methods Added to cluster.conf

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC">  <device name="apc" port="1"/> </method> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC">  <device name="apc" port="2"/> </method> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="APC">  <device name="apc" port="3"/> </method> </fence> </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>   </fencedevices>   <rm>   </rm></cluster>
In this example, a fence method (method) has been added to each node. The name of the fence method (name) for each node is APC. The device (device) for the fence method in each node specifies the name (name) as apc and a unique APC switch power port number (port) for each node. For example, the port number for node-01.example.com is 1 (port="1"). The device name for each node (device name="apc") points to the fence device by the name (name) of apc in this line of the fencedevices element: fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example".

Example 7.5. cluster.conf: Multiple Fence Methods per Node

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC">  <device name="apc" port="1"/> </method> <method name="SAN">  <device name="sanswitch1" port="11"/> </method> </fence> <unfence> <device name="sanswitch1" port="11" action="on"/>  </unfence </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC">  <device name="apc" port="2"/> </method> <method name="SAN">  <device name="sanswitch1" port="12"/> </method> </fence> <unfence> <device name="sanswitch1" port="12" action="on"/>  </unfence </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="APC">  <device name="apc" port="3"/> </method> <method name="SAN">  <device name="sanswitch1" port="13"/> </method> </fence> <unfence> <device name="sanswitch1" port="13" action="on"/>  </unfence </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/> <fencedevice agent="fence_sanbox2" ipaddr="san_ip_example"login="login_example" name="sanswitch1" passwd="password_example"/>   </fencedevices>   <rm>   </rm></cluster>

Example 7.6. cluster.conf: Fencing, Multipath Multiple Ports

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="SAN-multi">  <device name="sanswitch1" port="11"/>  <device name="sanswitch2" port="11"/> </method> </fence> <unfence> <device name="sanswitch1" port="11" action="on"/> <device name="sanswitch2" port="11" action="on"/> </unfence </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="SAN-multi">  <device name="sanswitch1" port="12"/>  <device name="sanswitch2" port="12"/> </method> </fence> <unfence> <device name="sanswitch1" port="12" action="on"/> <device name="sanswitch2" port="12" action="on"/> </unfence </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="SAN-multi">  <device name="sanswitch1" port="13"/>  <device name="sanswitch2" port="13"/> </method> </fence> <unfence> <device name="sanswitch1" port="13" action="on"/> <device name="sanswitch2" port="13" action="on"/> </unfence </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_sanbox2" ipaddr="san_ip_example"login="login_example" name="sanswitch1" passwd="password_example"/> <fencedevice agent="fence_sanbox2" ipaddr="san_ip_example"login="login_example" name="sanswitch2" passwd="password_example"/> </fencedevices>   <rm>   </rm></cluster>

Example 7.7. cluster.conf: Fencing Nodes with Dual Power Supplies

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC-dual">  <device name="apc1" port="1"action="off"/>  <device name="apc2" port="1"action="off"/>  <device name="apc1" port="1"action="on"/>  <device name="apc2" port="1"action="on"/> </method> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC-dual">  <device name="apc1" port="2"action="off"/>  <device name="apc2" port="2"action="off"/>  <device name="apc1" port="2"action="on"/>  <device name="apc2" port="2"action="on"/> </method> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="APC-dual">  <device name="apc1" port="3"action="off"/>  <device name="apc2" port="3"action="off"/>  <device name="apc1" port="3"action="on"/>  <device name="apc2" port="3"action="on"/> </method> </fence> </clusternode>   </clusternodes>   <fencedevices>   <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc1" passwd="password_example"/>   <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc2" passwd="password_example"/>   </fencedevices>   <rm>   </rm></cluster>
When using power switches to fence nodes with dual power supplies, the agents must be told to turn off both power ports before restoring power to either port. The default off-on behavior of the agent could result in the power never being fully disabled to the node.

7.4. Configuring Failover Domains

A failover domain is a named subset of cluster nodes that are eligible to run a cluster service in the event of a node failure. A failover domain can have the following characteristics:
  • Unrestricted - Allows you to specify that a subset of members are preferred, but that a cluster service assigned to this domain can run on any available member.
  • Restricted - Allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
  • Unordered - When a cluster service is assigned to an unordered failover domain, the member on which the cluster service runs is chosen from the available failover domain members with no priority ordering.
  • Ordered - Allows you to specify a preference order among the members of a failover domain. Ordered failover domains select the node with the lowest priority number first. That is, the node in a failover domain with a priority number of "1" specifies the highest priority, and therefore is the most preferred node in a failover domain. After that node, the next preferred node would be the node with the next highest priority number, and so on.
  • Failback - Allows you to specify whether a service in the failover domain should fail back to the node that it was originally running on before that node failed. Configuring this characteristic is useful in circumstances where a node repeatedly fails and is part of an ordered failover domain. In that circumstance, if a node is the preferred node in a failover domain, it is possible for a service to fail over and fail back repeatedly between the preferred node and another node, causing severe impact on performance.

    Note

    The failback characteristic is applicable only if ordered failover is configured.

Note

Changing a failover domain configuration has no effect on currently running services.

Note

Failover domains are not required for operation.
By default, failover domains are unrestricted and unordered.
In a cluster with several members, using a restricted failover domain can minimize the work to set up the cluster to run a cluster service (such as httpd), which requires you to set up the configuration identically on all members that run the cluster service. Instead of setting up the entire cluster to run the cluster service, you can set up only the members in the restricted failover domain that you associate with the cluster service.

Note

To configure a preferred member, you can create an unrestricted failover domain comprising only one cluster member. Doing that causes a cluster service to run on that cluster member primarily (the preferred member), but allows the cluster service to fail over to any of the other members.
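A minimal sketch of such a single-member, unrestricted failover domain follows; the domain name prefer_node1 and the node name are illustrative only:
 <failoverdomains>
     <failoverdomain name="prefer_node1" nofailback="0" ordered="0" restricted="0">
         <failoverdomainnode name="node-01.example.com" priority="1"/>
     </failoverdomain>
 </failoverdomains>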
To configure a failover domain, use the following procedures:
  1. Open /etc/cluster/cluster.conf at any node in the cluster.
  2. Add the following skeleton section within the rm element for each failover domain to be used:
     <failoverdomains>
         <failoverdomain name="" nofailback="" ordered="" restricted="">
             <failoverdomainnode name="" priority=""/>
             <failoverdomainnode name="" priority=""/>
             <failoverdomainnode name="" priority=""/>
         </failoverdomain>
     </failoverdomains>

    Note

    The number of failoverdomainnode elements depends on the number of nodes in the failover domain. The skeleton failoverdomain section in the preceding text shows three failoverdomainnode elements (with no node names specified), signifying that there are three nodes in the failover domain.
  3. In the failoverdomain section, provide the values for the elements and attributes. For descriptions of the elements and attributes, refer to the failoverdomain section of the annotated cluster schema. The annotated cluster schema is available at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example /usr/share/doc/cman-3.0.12/cluster_conf.html) in any of the cluster nodes. For an example of a failoverdomains section, refer to Example 7.8, "A Failover Domain Added to cluster.conf ".
  4. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  5. Save /etc/cluster/cluster.conf.
  6. (Optional) Validate the file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  7. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
Example 7.8, "A Failover Domain Added to cluster.conf " shows an example of a configuration with an ordered, unrestricted failover domain.

Example 7.8. A Failover Domain Added to cluster.conf

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC">  <device name="apc" port="1"/> </method> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC">  <device name="apc" port="2"/> </method> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="APC">  <device name="apc" port="3"/> </method> </fence> </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>   </fencedevices>   <rm>   <failoverdomains>   <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">   <failoverdomainnode name="node-01.example.com" priority="1"/>   <failoverdomainnode name="node-02.example.com" priority="2"/>   <failoverdomainnode name="node-03.example.com" priority="3"/>   </failoverdomain>   </failoverdomains>   </rm></cluster>
The failoverdomains section contains a failoverdomain section for each failover domain in the cluster. This example has one failover domain. In the failoverdomain line, the name (name) is specified as example_pri. In addition, it specifies that failback is not disabled (nofailback="0"), that failover is ordered (ordered="1"), and that the failover domain is unrestricted (restricted="0").

7.5. Configuring HA Services

Configuring HA (High Availability) services consists of configuring resources and assigning them to services.
The following sections describe how to edit /etc/cluster/cluster.conf to add resources and services.

Important

There can be a wide range of configurations possible with High Availability resources and services. For a better understanding about resource parameters and resource behavior, refer to Appendix B, HA Resource Parameters and Appendix C, HA Resource Behavior. For optimal performance and to ensure that your configuration can be supported, contact an authorized Red Hat support representative.

7.5.1. Adding Cluster Resources

You can configure two types of resources:
  • Global - Resources that are available to any service in the cluster. These are configured in the resources section of the configuration file (within the rm element).
  • Service-specific - Resources that are available to only one service. These are configured in each service section of the configuration file (within the rm element).
This section describes how to add a global resource. For procedures about configuring service-specific resources, refer to Section 7.5.2, "Adding a Cluster Service to the Cluster".
To add a global cluster resource, follow the steps in this section.
  1. Open /etc/cluster/cluster.conf at any node in the cluster.
  2. Add a resources section within the rm element. For example:
     <rm>
         <resources>
         </resources>
     </rm>
  3. Populate it with resources according to the services you want to create. For example, here are resources that are to be used in an Apache service. They consist of a file system (fs) resource, an IP (ip) resource, and an Apache (apache) resource.
     <rm>
         <resources>
            <fs name="web_fs" device="/dev/sdd2" mountpoint="/var/www" fstype="ext3"/>
            <ip address="127.143.131.100" monitor_link="yes" sleeptime="10"/>
            <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/>
         </resources>
     </rm>
    Example 7.9, "cluster.conf File with Resources Added " shows an example of a cluster.conf file with the resources section added.
  4. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  5. Save /etc/cluster/cluster.conf.
  6. (Optional) Validate the file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  7. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
  8. Verify that the updated configuration file has been propagated.

Example 7.9. cluster.conf File with Resources Added

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC">  <device name="apc" port="1"/> </method> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC">  <device name="apc" port="2"/> </method> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="APC">  <device name="apc" port="3"/> </method> </fence> </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>   </fencedevices>   <rm>   <failoverdomains>   <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">   <failoverdomainnode name="node-01.example.com" priority="1"/>   <failoverdomainnode name="node-02.example.com" priority="2"/>   <failoverdomainnode name="node-03.example.com" priority="3"/>   </failoverdomain>   </failoverdomains>   <resources>   <fs name="web_fs" device="/dev/sdd2" mountpoint="/var/www" fstype="ext3"/>   <ip address="127.143.131.100" monitor_link="yes" sleeptime="10"/>   <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/> </resources>   </rm></cluster>

7.5.2. Adding a Cluster Service to the Cluster

To add a cluster service to the cluster, follow the steps in this section.
  1. Open /etc/cluster/cluster.conf at any node in the cluster.
  2. Add a service section within the rm element for each service. For example:
     <rm>
         <service autostart="1" domain="" exclusive="0" name="" recovery="restart">
         </service>
     </rm>
  3. Configure the following parameters (attributes) in the service element:
    • autostart - Specifies whether to autostart the service when the cluster starts. Use '1' to enable and '0' to disable; the default is enabled.
    • domain - Specifies a failover domain (if required).
    • exclusive - Specifies a policy wherein the service only runs on nodes that have no other services running on them.
    • recovery - Specifies a recovery policy for the service. The options are to relocate, restart, disable, or restart-disable the service.
  4. Depending on the type of resources you want to use, populate the service with global or service-specific resources.
    For example, here is an Apache service that uses global resources:
     <rm>
         <resources>
            <fs name="web_fs" device="/dev/sdd2" mountpoint="/var/www" fstype="ext3"/>
            <ip address="127.143.131.100" monitor_link="yes" sleeptime="10"/>
            <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/>
         </resources>
         <service autostart="1" domain="example_pri" exclusive="0" name="example_apache" recovery="relocate">
            <fs ref="web_fs"/>
            <ip ref="127.143.131.100"/>
            <apache ref="example_server"/>
         </service>
     </rm>
    For example, here is an Apache service that uses service-specific resources:
     <rm>
         <service autostart="0" domain="example_pri" exclusive="0" name="example_apache2" recovery="relocate">
            <fs name="web_fs2" device="/dev/sdd3" mountpoint="/var/www2" fstype="ext3"/>
            <ip address="127.143.131.101" monitor_link="yes" sleeptime="10"/>
            <apache config_file="conf/httpd.conf" name="example_server2" server_root="/etc/httpd" shutdown_wait="0"/>
         </service>
     </rm>
    • example_apache - This service uses global resources web_fs, 127.143.131.100, and example_server.
    • example_apache2 - This service uses service-specific resources web_fs2, 127.143.131.101, and example_server2.
  5. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  6. Save /etc/cluster/cluster.conf.
  7. (Optional) Validate the updated file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  8. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
  9. Verify that the updated configuration file has been propagated.

Example 7.10. cluster.conf with Services Added: One Using Global Resources and One Using Service-Specific Resources

<cluster name="mycluster" config_version="3">   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC">  <device name="apc" port="1"/> </method> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC">  <device name="apc" port="2"/> </method> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="APC">  <device name="apc" port="3"/> </method> </fence> </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>   </fencedevices>   <rm>   <failoverdomains>   <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">   <failoverdomainnode name="node-01.example.com" priority="1"/>   <failoverdomainnode name="node-02.example.com" priority="2"/>   <failoverdomainnode name="node-03.example.com" priority="3"/>   </failoverdomain>   </failoverdomains>   <resources>   <fs name="web_fs" device="/dev/sdd2" mountpoint="/var/www" fstype="ext3"/>   <ip address="127.143.131.100" monitor_link="yes" sleeptime="10"/>   <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/>   </resources>   <service autostart="1" domain="example_pri" exclusive="0" name="example_apache" recovery="relocate">   <fs ref="web_fs"/>   <ip ref="127.143.131.100"/>   <apache ref="example_server"/>   </service>   <service autostart="0" domain="example_pri" exclusive="0" name="example_apache2" recovery="relocate">   <fs name="web_fs2" device="/dev/sdd3" mountpoint="/var/www2" fstype="ext3"/>   <ip address="127.143.131.101" monitor_link="yes" sleeptime="10"/>   <apache config_file="conf/httpd.conf" name="example_server2" server_root="/etc/httpd" shutdown_wait="0"/>   </service>   </rm></cluster>

7.6. Configuring Redundant Ring Protocol

As of Red Hat Enterprise Linux 6.4, the Red Hat High Availability Add-On supports the configuration of redundant ring protocol.
When configuring a system to use redundant ring protocol, you must take the following considerations into account:
  • Do not specify more than two rings.
  • Each ring must use the same protocol; do not mix IPv4 and IPv6.
  • If necessary, you can manually specify a multicast address for the second ring. If you specify a multicast address for the second ring, either the alternate multicast address or the alternate port must be different from the multicast address for the first ring. If you do not specify an alternate multicast address, the system will automatically use a different multicast address for the second ring.
    If you specify an alternate port, the port numbers of the first ring and the second ring must differ by at least two, since the system itself uses port and port-1 to perform operations.
  • Do not use two different interfaces on the same subnet.
  • In general, it is a good practice to configure redundant ring protocol on two different NICs and two different switches, in case one NIC or one switch fails.
  • Do not use the ifdown command or the service network stop command to simulate network failure. This destroys the whole cluster and requires that you restart all of the nodes in the cluster to recover.
  • Do not use NetworkManager, since it will execute the ifdown command if the cable is unplugged.
  • When one NIC of a node fails, the entire ring is marked as failed.
  • No manual intervention is required to recover a failed ring. To recover, you only need to fix the original reason for the failure, such as a failed NIC or switch.
To specify a second network interface to use for redundant ring protocol, you add an altname component to the clusternode section of the cluster.conf configuration file. When specifying altname, you must specify a name attribute to indicate a second host name or IP address for the node.
The following example specifies clusternet-node1-eth2 as the alternate name for cluster node clusternet-node1-eth1.
<cluster name="mycluster" config_version="3" >  <logging debug="on"/>  <clusternodes> <clusternode name="clusternet-node1-eth1" votes="1" nodeid="1">  <fence> <method name="single">  <device name="xvm" domain="clusternet-node1"/> </method>  </fence>  <altname name="clusternet-node1-eth2"/> </clusternode>
The altname section within the clusternode block is not position dependent. It can come before or after the fence section. Do not specify more than one altname component for a cluster node or the system will fail to start.
Optionally, you can manually specify a multicast address, a port, and a TTL for the second ring by including an altmulticast component in the cman section of the cluster.conf configuration file. The altmulticast component accepts an addr, a port, and a ttl parameter.
The following example shows the cman section of a cluster configuration file that sets a multicast address, port, and TTL for the second ring.
<cman>
   <multicast addr="239.192.99.73" port="666" ttl="2"/>
   <altmulticast addr="239.192.99.88" port="888" ttl="3"/>
</cman>

7.7. Configuring Debug Options

You can enable debugging for all daemons in a cluster, or you can enable logging for specific cluster processing.
To enable debugging for all daemons, add the following to the /etc/cluster/cluster.conf file. By default, logging is directed to the /var/log/cluster/daemon.log file.
<cluster config_version="7" name="rh6cluster">  <logging debug="on"/>   ...  </cluster> 
To enable debugging for individual cluster processes, add the following lines to the /etc/cluster/cluster.conf file. Per-daemon logging configuration overrides the global settings.
<cluster config_version="7" name="rh6cluster">   ...  <logging>   <!-- turning on per-subsystem debug logging --> <logging_daemon name="corosync" debug="on" /> <logging_daemon name="fenced" debug="on" /> <logging_daemon name="qdiskd" debug="on" /> <logging_daemon name="rgmanager" debug="on" />  <logging_daemon name="dlm_controld" debug="on" /> <logging_daemon name="gfs_controld" debug="on" />  </logging> ...</cluster>
For a list of the logging daemons for which you can enable logging as well as the additional logging options you can configure for both global and per-daemon logging, refer to the cluster.conf(5) man page.
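As a further illustration, the global logging element can also redirect log output; the attribute names shown below (to_syslog, to_logfile, syslog_facility, logfile) follow common cluster.conf logging examples and should be verified against the cluster.conf(5) man page on your system before use:
<cluster config_version="8" name="rh6cluster">
  <logging to_syslog="yes" to_logfile="yes" logfile="/var/log/cluster/cluster.log" syslog_facility="local4"/>
   ...
</cluster>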

7.8. Verifying a Configuration

Once you have created your cluster configuration file, verify that it is running correctly by performing the following steps:
  1. At each node, restart the cluster software. That action ensures that any configuration additions that are checked only at startup time are included in the running configuration. You can restart the cluster software by running service cman restart. For example:
    [root@example-01 ~]# service cman restart
    Stopping cluster:
       Leaving fence domain...                                 [  OK  ]
       Stopping gfs_controld...                                [  OK  ]
       Stopping dlm_controld...                                [  OK  ]
       Stopping fenced...                                      [  OK  ]
       Stopping cman...                                        [  OK  ]
       Waiting for corosync to shutdown:                       [  OK  ]
       Unloading kernel modules...                             [  OK  ]
       Unmounting configfs...                                  [  OK  ]
    Starting cluster:
       Checking Network Manager...                             [  OK  ]
       Global setup...                                         [  OK  ]
       Loading kernel modules...                               [  OK  ]
       Mounting configfs...                                    [  OK  ]
       Starting cman...                                        [  OK  ]
       Waiting for quorum...                                   [  OK  ]
       Starting fenced...                                      [  OK  ]
       Starting dlm_controld...                                [  OK  ]
       Starting gfs_controld...                                [  OK  ]
       Unfencing self...                                       [  OK  ]
       Joining fence domain...                                 [  OK  ]
  2. Run service clvmd start, if CLVM is being used to create clustered volumes. For example:
    [root@example-01 ~]# service clvmd start
    Activating VGs:                                            [  OK  ]
  3. Run service gfs2 start, if you are using Red Hat GFS2. For example:
    [root@example-01 ~]# service gfs2 start
    Mounting GFS2 filesystem (/mnt/gfsA):                      [  OK  ]
    Mounting GFS2 filesystem (/mnt/gfsB):                      [  OK  ]
  4. Run service rgmanager start, if you are using high-availability (HA) services. For example:
    [root@example-01 ~]# service rgmanager start
    Starting Cluster Service Manager:                          [  OK  ]
  5. At any cluster node, run cman_tool nodes to verify that the nodes are functioning as members in the cluster (signified as "M" in the status column, "Sts"). For example:
    [root@example-01 ~]# cman_tool nodes
    Node  Sts   Inc   Joined               Name
       1   M    548   2010-09-28 10:52:21  node-01.example.com
       2   M    548   2010-09-28 10:52:21  node-02.example.com
       3   M    544   2010-09-28 10:52:21  node-03.example.com
  6. At any node, using the clustat utility, verify that the HA services are running as expected. In addition, clustat displays status of the cluster nodes. For example:
    [root@example-01 ~]# clustat
    Cluster Status for mycluster @ Wed Nov 17 05:40:00 2010
    Member Status: Quorate

     Member Name                          ID   Status
     ------ ----                          ---- ------
     node-03.example.com                     3 Online, rgmanager
     node-02.example.com                     2 Online, rgmanager
     node-01.example.com                     1 Online, Local, rgmanager

     Service Name                 Owner (Last)                 State
     ------- ----                 ----- ------                 -----
     service:example_apache       node-01.example.com          started
     service:example_apache2      (none)                       disabled
  7. If the cluster is running as expected, you are done with creating a configuration file. You can manage the cluster with command-line tools described in Chapter 8, Managing Red Hat High Availability Add-On With Command Line Tools.
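In addition to cman_tool nodes and clustat, you can run the cman_tool status command at any node for a summary of cluster membership and quorum state; this is an optional extra check, and its output is not reproduced here:
    [root@example-01 ~]# cman_tool status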