Cluster Administration

Chapter 2. Before Configuring the Red Hat High Availability Add-On

This chapter describes tasks to perform and considerations to make before installing and configuring the Red Hat High Availability Add-On, and consists of the following sections.

Important

Make sure that your deployment of Red Hat High Availability Add-On meets your needs and can be supported. Consult with an authorized Red Hat representative to verify your configuration prior to deployment. In addition, allow time for a configuration burn-in period to test failure modes.

2.1. General Configuration Considerations

You can configure the Red Hat High Availability Add-On in a variety of ways to suit your needs. Take into account the following general considerations when you plan, configure, and implement your deployment.
Number of cluster nodes supported
The maximum number of cluster nodes supported by the High Availability Add-On is 16.
Single site clusters
Only single site clusters are fully supported at this time. Clusters spread across multiple physical locations are not formally supported. For more details and to discuss multi-site clusters, please speak to your Red Hat sales or support representative.
GFS2
Although a GFS2 file system can be implemented in a standalone system or as part of a cluster configuration, Red Hat does not support the use of GFS2 as a single-node file system. Red Hat does support a number of high-performance single-node file systems that are optimized for single-node use and thus generally have lower overhead than a cluster file system. Red Hat recommends using those file systems in preference to GFS2 in cases where only a single node needs to mount the file system. Red Hat will continue to support single-node GFS2 file systems for existing customers.
When you configure a GFS2 file system as a cluster file system, you must ensure that all nodes in the cluster have access to the shared file system. Asymmetric cluster configurations in which some nodes have access to the file system and others do not are not supported. This does not require that all nodes actually mount the GFS2 file system itself.
No-single-point-of-failure hardware configuration
Clusters can include a dual-controller RAID array, multiple bonded network channels, multiple paths between cluster members and storage, and redundant uninterruptible power supply (UPS) systems to ensure that no single failure results in application down time or loss of data.
Alternatively, a low-cost cluster can be set up to provide less availability than a no-single-point-of-failure cluster. For example, you can set up a cluster with a single-controller RAID array and only a single Ethernet channel.
Certain low-cost alternatives, such as host RAID controllers, software RAID without cluster support, and multi-initiator parallel SCSI configurations are not compatible or appropriate for use as shared cluster storage.
Data integrity assurance
To ensure data integrity, only one node can run a cluster service and access cluster-service data at a time. The use of power switches in the cluster hardware configuration enables a node to power-cycle another node before restarting that node's HA services during a failover process. This prevents two nodes from simultaneously accessing the same data and corrupting it. Fence devices (hardware or software solutions that remotely power, shutdown, and reboot cluster nodes) are used to guarantee data integrity under all failure conditions.
Ethernet channel bonding
Cluster quorum and node health are determined by communication of messages among cluster nodes via Ethernet. In addition, cluster nodes use Ethernet for a variety of other critical cluster functions (for example, fencing). With Ethernet channel bonding, multiple Ethernet interfaces are configured to behave as one, reducing the risk of a single-point-of-failure in the typical switched Ethernet connection among cluster nodes and other cluster hardware.
As of Red Hat Enterprise Linux 6.4, bonding modes 0, 1, and 2 are supported. (A brief bonding configuration sketch appears at the end of this section.)
IPv4 and IPv6
The High Availability Add-On supports both IPv4 and IPv6 Internet Protocols. Support of IPv6 in the High Availability Add-On is new for Red Hat Enterprise Linux 6.
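As noted under Ethernet channel bonding above, bonded interfaces are configured through the standard network scripts. The following is a minimal sketch, not a documented procedure, of an active-backup (mode 1) bond on Red Hat Enterprise Linux 6; the interface names, addressing, and bonding options are hypothetical and should be adapted to your site.

    # /etc/sysconfig/network-scripts/ifcfg-bond0 (hypothetical addressing)
    DEVICE=bond0
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none
    BONDING_OPTS="mode=1 miimon=100"

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for each slave interface, for example eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none

Mode 1 (active-backup) keeps one interface idle until the active interface fails; restart networking with service network restart for the change to take effect.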

2.2. Compatible Hardware

Before configuring Red Hat High Availability Add-On software, make sure that your cluster uses appropriate hardware (for example, supported fence devices, storage devices, and Fibre Channel switches). Refer to the hardware configuration guidelines at http://www.redhat.com/cluster_suite/hardware/ for the most current hardware compatibility information.

2.3. Enabling IP Ports

Before deploying the Red Hat High Availability Add-On, you must enable certain IP ports on the cluster nodes and on computers that run luci (the Conga user interface server). The following sections identify the IP ports to be enabled:
  • Section 2.3.1, "Enabling IP Ports on Cluster Nodes"
  • Section 2.3.2, "Enabling the IP Port for luci"
Section 2.3.3, "Configuring the iptables Firewall to Allow Cluster Components" provides the iptables rules for enabling the IP ports needed by the Red Hat High Availability Add-On.

2.3.1. Enabling IP Ports on Cluster Nodes

To allow the nodes in a cluster to communicate with each other, you must enable the IP ports assigned to certain Red Hat High Availability Add-On components. Table 2.1, "Enabled IP Ports on Red Hat High Availability Add-On Nodes" lists the IP port numbers, their respective protocols, and the components to which the port numbers are assigned. At each cluster node, enable IP ports according to Table 2.1, "Enabled IP Ports on Red Hat High Availability Add-On Nodes". You can use system-config-firewall to enable the IP ports.

Table 2.1. Enabled IP Ports on Red Hat High Availability Add-On Nodes

IP Port Number   Protocol   Component
5404, 5405       UDP        corosync/cman (Cluster Manager)
11111            TCP        ricci (propagates updated cluster information)
21064            TCP        dlm (Distributed Lock Manager)
16851            TCP        modclusterd
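As a quick sanity check, offered here as a hedged suggestion rather than a documented step, you can confirm that the running firewall contains rules covering the ports in Table 2.1:

    # run as root on each cluster node
    iptables -L INPUT -n --line-numbers | grep -E '5404|5405|11111|21064|16851'

If no matching rules are listed, add them with system-config-firewall or with the iptables rules in Section 2.3.3, "Configuring the iptables Firewall to Allow Cluster Components".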

2.3.2. Enabling the IP Port for luci

To allow client computers to communicate with a computer that runs luci (the Conga user interface server), you must enable the IP port assigned to luci. At each computer that runs luci, enable the IP port according to Table 2.2, "Enabled IP Port on a Computer That Runs luci".

Note

If a cluster node is running luci, port 11111 should already have been enabled.

Table 2.2. Enabled IP Port on a Computer That Runs luci

IP Port Number   Protocol   Component
8084             TCP        luci (Conga user interface server)

As of the Red Hat Enterprise Linux 6.1 release, which enabled configuration by means of the /etc/sysconfig/luci file, you can specifically configure the IP address at which luci is served. You can use this capability if your server infrastructure incorporates more than one network and you want to access luci from the internal network only. To do this, uncomment and edit the line in the file that specifies host. For example, to change the host setting in the file to 10.10.10.10, edit the host line as follows:
host = 10.10.10.10
For more information on the /etc/sysconfig/luci file, refer to Section 2.4, "Configuring luci with /etc/sysconfig/luci".

2.3.3. Configuring the iptables Firewall to Allow Cluster Components

Listed below are example iptables rules for enabling IP ports needed by Red Hat Enterprise Linux 6 (with the High Availability Add-On). Note that these examples use 192.168.1.0/24 as a subnet; replace 192.168.1.0/24 with the appropriate subnet for your environment if you use these rules.
For cman (Cluster Manager), use the following filtering.
$ iptables -I INPUT -m state --state NEW -m multiport -p udp -s 192.168.1.0/24 -d 192.168.1.0/24 --dports 5404,5405 -j ACCEPT
$ iptables -I INPUT -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport -p udp -s 192.168.1.0/24 --dports 5404,5405 -j ACCEPT
For dlm (Distributed Lock Manager):
$ iptables -I INPUT -m state --state NEW -p tcp -s 192.168.1.0/24 -d 192.168.1.0/24 --dport 21064 -j ACCEPT 
For ricci (part of Conga remote agent):
$ iptables -I INPUT -m state --state NEW -p tcp -s 192.168.1.0/24 -d 192.168.1.0/24 --dport 11111 -j ACCEPT
For modclusterd (part of Conga remote agent):
$ iptables -I INPUT -m state --state NEW -p tcp -s 192.168.1.0/24 -d 192.168.1.0/24 --dport 16851 -j ACCEPT
For luci (Conga User Interface server):
$ iptables -I INPUT -m state --state NEW -p tcp -s 192.168.1.0/24 -d 192.168.1.0/24 --dport 8084 -j ACCEPT
For igmp (Internet Group Management Protocol):
$ iptables -I INPUT -p igmp -j ACCEPT
After executing these commands, run the following commands to save the current configuration so that the changes persist across reboots.
$ service iptables save ; service iptables restart

2.4. Configuring luci with /etc/sysconfig/luci

As of the Red Hat Enterprise Linux 6.1 release, you can configure some aspects of luci's behavior by means of the /etc/sysconfig/luci file. The parameters you can change with this file include auxiliary settings of the running environment used by the init script as well as server configuration. In addition, you can edit this file to modify some application configuration parameters. There are instructions within the file itself describing which configuration parameters you can change by editing this file.
In order to protect the intended format, you should not change the non-configuration lines of the /etc/sysconfig/luci file when you edit the file. Additionally, you should take care to follow the required syntax for this file, particularly for the INITSCRIPT section which does not allow for white spaces around the equal sign and requires that you use quotation marks to enclose strings containing white spaces.
The following example shows how to change the port at which luci is being served by editing the /etc/sysconfig/luci file.
  1. Uncomment the following line in the /etc/sysconfig/luci file:
    #port = 4443
  2. Replace 4443 with the desired port number, which must be higher than or equal to 1024 (not a privileged port). For example, you can edit that line of the file as follows to set the port at which luci is being served to 8084.
    port = 8084
  3. Restart the luci service for the changes to take effect.
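The same change can be scripted. The following is a minimal sketch of the steps above, assuming the default layout of /etc/sysconfig/luci; run the commands as root on the luci server.

    sed -i 's/^#port = 4443/port = 8084/' /etc/sysconfig/luci
    service luci restart

Remember to open the new port in the firewall, as described in the Important box that follows.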

Important

When you modify a configuration parameter in the /etc/sysconfig/luci file to redefine a default value, you should take care to use the new value in place of the documented default value. For example, when you modify the port at which luci is being served, make sure that you specify the modified value when you enable an IP port for luci, as described in Section 2.3.2, "Enabling the IP Port for luci".
Modified port and host parameters will automatically be reflected in the URL displayed when the luci service starts, as described in Section 3.2, "Starting luci". You should use this URL to access luci.
For more complete information on the parameters you can configure with the /etc/sysconfig/luci file, refer to the documentation within the file itself.

2.5. Configuring ACPI For Use with Integrated Fence Devices

If your cluster uses integrated fence devices, you must configure ACPI (Advanced Configuration and Power Interface) to ensure immediate and complete fencing.

Note

For the most current information about integrated fence devices supported by Red Hat High Availability Add-On, refer to http://www.redhat.com/cluster_suite/hardware/.
If a cluster node is configured to be fenced by an integrated fence device, disable ACPI Soft-Off for that node. Disabling ACPI Soft-Off allows an integrated fence device to turn off a node immediately and completely rather than attempting a clean shutdown (for example, shutdown -h now). Otherwise, if ACPI Soft-Off is enabled, an integrated fence device can take four or more seconds to turn off a node (refer to the note that follows). In addition, if ACPI Soft-Off is enabled and a node panics or freezes during shutdown, an integrated fence device may not be able to turn off the node. Under those circumstances, fencing is delayed or unsuccessful. Consequently, when a node is fenced with an integrated fence device and ACPI Soft-Off is enabled, a cluster recovers slowly or requires administrative intervention to recover.

Note

The amount of time required to fence a node depends on the integrated fence device used. Some integrated fence devices perform the equivalent of pressing and holding the power button; therefore, the fence device turns off the node in four to five seconds. Other integrated fence devices perform the equivalent of pressing the power button momentarily, relying on the operating system to turn off the node; therefore, the fence device turns off the node in a time span much longer than four to five seconds.
To disable ACPI Soft-Off, use chkconfig management and verify that the node turns off immediately when fenced. The preferred way to disable ACPI Soft-Off is with chkconfig management; however, if that method is not satisfactory for your cluster, you can disable ACPI Soft-Off with one of the following alternate methods:
  • Changing the BIOS setting to "instant-off" or an equivalent setting that turns off the node without delay

    Note

    Disabling ACPI Soft-Off with the BIOS may not be possible with some computers.
  • Appending acpi=off to the kernel boot command line of the /boot/grub/grub.conf file

    Important

    This method completely disables ACPI; some computers do not boot correctly if ACPI is completely disabled. Use this method only if the other methods are not effective for your cluster.
The following sections provide procedures for the preferred method and alternate methods of disabling ACPI Soft-Off:

2.5.1. Disabling ACPI Soft-Off with chkconfig Management

You can use chkconfig management to disable ACPI Soft-Off either by removing the ACPI daemon (acpid) from chkconfig management or by turning off acpid.

Note

This is the preferred method of disabling ACPI Soft-Off.
Disable ACPI Soft-Off with chkconfig management at each cluster node as follows:
  1. Run either of the following commands:
    • chkconfig --del acpid - This command removes acpid from chkconfig management.
      - OR -
    • chkconfig --level 2345 acpid off - This command turns off acpid.
  2. Reboot the node.
  3. When the cluster is configured and running, verify that the node turns off immediately when fenced.

    Note

    You can fence the node with the fence_node command or Conga.
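As a rough illustration of step 3, you can fence the node from another cluster member and confirm that it powers off immediately, with no clean-shutdown delay; the node name below is hypothetical.

    # run as root on another cluster node
    fence_node node-01.example.com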

2.5.2. Disabling ACPI Soft-Off with the BIOS

The preferred method of disabling ACPI Soft-Off is with chkconfig management (Section 2.5.1, "Disabling ACPI Soft-Off with chkconfig Management"). However, if the preferred method is not effective for your cluster, follow the procedure in this section.

Note

Disabling ACPI Soft-Off with the BIOS may not be possible with some computers.
You can disable ACPI Soft-Off by configuring the BIOS of each cluster node as follows:
  1. Reboot the node and start the BIOS CMOS Setup Utility program.
  2. Navigate to the Power menu (or equivalent power management menu).
  3. At the Power menu, set the Soft-Off by PWR-BTTN function (or equivalent) to Instant-Off (or the equivalent setting that turns off the node via the power button without delay). Example 2.1, "BIOS CMOS Setup Utility: Soft-Off by PWR-BTTN set to Instant-Off" shows a Power menu with ACPI Function set to Enabled and Soft-Off by PWR-BTTN set to Instant-Off.

    Note

    The equivalents to ACPI Function, Soft-Off by PWR-BTTN, and Instant-Off may vary among computers. However, the objective of this procedure is to configure the BIOS so that the computer is turned off via the power button without delay.
  4. Exit the BIOS CMOS Setup Utility program, saving the BIOS configuration.
  5. When the cluster is configured and running, verify that the node turns off immediately when fenced.

    Note

    You can fence the node with the fence_node command or Conga.

Example 2.1. BIOS CMOS Setup Utility: Soft-Off by PWR-BTTN set to Instant-Off

+---------------------------------------------|-------------------+
|    ACPI Function             [Enabled]      |    Item Help      |
|    ACPI Suspend Type         [S1(POS)]      |-------------------|
|  x Run VGABIOS if S3 Resume   Auto          |   Menu Level   *  |
|    Suspend Mode              [Disabled]     |                   |
|    HDD Power Down            [Disabled]     |                   |
|    Soft-Off by PWR-BTTN      [Instant-Off   |                   |
|    CPU THRM-Throttling       [50.0%]        |                   |
|    Wake-Up by PCI card       [Enabled]      |                   |
|    Power On by Ring          [Enabled]      |                   |
|    Wake Up On LAN            [Enabled]      |                   |
|  x USB KB Wake-Up From S3    Disabled       |                   |
|    Resume by Alarm           [Disabled]     |                   |
|  x  Date(of Month) Alarm      0             |                   |
|  x  Time(hh:mm:ss) Alarm      0 :  0 :      |                   |
|    POWER ON Function         [BUTTON ONLY   |                   |
|  x KB Power ON Password      Enter          |                   |
|  x Hot Key Power ON          Ctrl-F1        |                   |
|                                             |                   |
|                                             |                   |
+---------------------------------------------|-------------------+
This example shows ACPI Function set to Enabled, and Soft-Off by PWR-BTTN set to Instant-Off.

2.5.3. Disabling ACPI Completely in the grub.conf File

The preferred method of disabling ACPI Soft-Off is with chkconfig management (Section 2.5.1, "Disabling ACPI Soft-Off with chkconfig Management"). If the preferred method is not effective for your cluster, you can disable ACPI Soft-Off with the BIOS power management (Section 2.5.2, "Disabling ACPI Soft-Off with the BIOS"). If neither of those methods is effective for your cluster, you can disable ACPI completely by appending acpi=off to the kernel boot command line in the grub.conf file.

Important

This method completely disables ACPI; some computers do not boot correctly if ACPI is completely disabled. Use this method only if the other methods are not effective for your cluster.
You can disable ACPI completely by editing the grub.conf file of each cluster node as follows:
  1. Open /boot/grub/grub.conf with a text editor.
  2. Append acpi=off to the kernel boot command line in /boot/grub/grub.conf (refer to Example 2.2, "Kernel Boot Command Line with acpi=off Appended to It").
  3. Reboot the node.
  4. When the cluster is configured and running, verify that the node turns off immediately when fenced.

    Note

    You can fence the node with the fence_node command or Conga.

Example 2.2. Kernel Boot Command Line with acpi=off Appended to It

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/vg_doc01-lv_root
#          initrd /initrd-[generic-]version.img
#boot=/dev/hda
default=0
timeout=5
serial --unit=0 --speed=115200
terminal --timeout=5 serial console
title Red Hat Enterprise Linux Server (2.6.32-193.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-193.el6.x86_64 ro root=/dev/mapper/vg_doc01-lv_root console=ttyS0,115200n8 acpi=off
        initrd /initramfs-2.6.32-131.0.15.el6.x86_64.img
In this example, acpi=off has been appended to the kernel boot command line - the line starting with "kernel /vmlinuz-2.6.32-193.el6.x86_64".

2.6. Considerations for Configuring HA Services

You can create a cluster to suit your needs for high availability by configuring HA (high-availability) services. The key component for HA service management in the Red Hat High Availability Add-On, rgmanager, implements cold failover for off-the-shelf applications. In the Red Hat High Availability Add-On, an application is configured with other cluster resources to form an HA service that can fail over from one cluster node to another with no apparent interruption to cluster clients. HA-service failover can occur if a cluster node fails or if a cluster system administrator moves the service from one cluster node to another (for example, for a planned outage of a cluster node).
To create an HA service, you must configure it in the cluster configuration file. An HA service comprises cluster resources. Cluster resources are building blocks that you create and manage in the cluster configuration file - for example, an IP address, an application initialization script, or a Red Hat GFS2 shared partition.
An HA service can run on only one cluster node at a time to maintain data integrity. You can specify failover priority in a failover domain. Specifying failover priority consists of assigning a priority level to each node in a failover domain. The priority level determines the failover order - that is, the node to which an HA service should fail over. If you do not specify failover priority, an HA service can fail over to any node in its failover domain. Also, you can specify if an HA service is restricted to run only on nodes of its associated failover domain. (When associated with an unrestricted failover domain, an HA service can start on any cluster node in the event no member of the failover domain is available.)
Figure 2.1, "Web Server Cluster Service Example" shows an example of an HA service that is a web server named "content-webserver". It is running in cluster node B and is in a failover domain that consists of nodes A, B, and D. In addition, the failover domain is configured with a failover priority to fail over to node D before node A and to restrict failover to nodes only in that failover domain. The HA service comprises these cluster resources:
  • IP address resource - IP address 10.10.10.201.
  • An application resource named "httpd-content" - a web server application init script /etc/init.d/httpd (specifying httpd).
  • A file system resource - Red Hat GFS2 named "gfs2-content-webserver".

Figure 2.1. Web Server Cluster Service Example


Clients access the HA service through the IP address 10.10.10.201, enabling interaction with the web server application, httpd-content. The httpd-content application uses the gfs2-content-webserver file system. If node B were to fail, the content-webserver HA service would fail over to node D. If node D were not available or also failed, the service would fail over to node A. Failover would occur with minimal service interruption to the cluster clients. For example, in an HTTP service, certain state information may be lost (like session data). The HA service would be accessible from another cluster node via the same IP address as it was before failover.

Note

For more information about HA services and failover domains, refer to the High Availability Add-On Overview. For information about configuring failover domains, refer to Chapter 3, Configuring Red Hat High Availability Add-On With Conga (using Conga) or Chapter 7, Configuring Red Hat High Availability Add-On With Command Line Tools (using command line utilities).
An HA service is a group of cluster resources configured into a coherent entity that provides specialized services to clients. An HA service is represented as a resource tree in the cluster configuration file, /etc/cluster/cluster.conf (in each cluster node). In the cluster configuration file, each resource tree is an XML representation that specifies each resource, its attributes, and its relationship among other resources in the resource tree (parent, child, and sibling relationships).

Note

Because an HA service consists of resources organized into a hierarchical tree, a service is sometimes referred to as a resource tree or resource group. Both phrases are synonymous with HA service.
At the root of each resource tree is a special type of resource - a service resource. Other types of resources comprise the rest of a service, determining its characteristics. Configuring an HA service consists of creating a service resource, creating subordinate cluster resources, and organizing them into a coherent entity that conforms to hierarchical restrictions of the service.
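To make the structure concrete, the following is a hedged sketch of how the content-webserver service from Figure 2.1 might be expressed in the rm section of /etc/cluster/cluster.conf. The failover domain name, device path, and mount point are hypothetical; refer to Appendix B, HA Resource Parameters for the authoritative resource attributes.

    <rm>
       <failoverdomains>
          <failoverdomain name="example-domain" ordered="1" restricted="1">
             <failoverdomainnode name="node-B.example.com" priority="1"/>
             <failoverdomainnode name="node-D.example.com" priority="2"/>
             <failoverdomainnode name="node-A.example.com" priority="3"/>
          </failoverdomain>
       </failoverdomains>
       <resources>
          <ip address="10.10.10.201" monitor_link="on"/>
          <script name="httpd-content" file="/etc/init.d/httpd"/>
          <clusterfs name="gfs2-content-webserver" fstype="gfs2" device="/dev/vg_cluster/lv_web" mountpoint="/var/www"/>
       </resources>
       <service name="content-webserver" domain="example-domain" recovery="relocate">
          <ip ref="10.10.10.201"/>
          <script ref="httpd-content"/>
          <clusterfs ref="gfs2-content-webserver"/>
       </service>
    </rm>

Here the service resource is the root of the resource tree and the ip, script, and clusterfs resources are its children; a lower priority number marks a more preferred node in the ordered failover domain.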
There are two major considerations to take into account when configuring an HA service:
  • The types of resources needed to create a service
  • Parent, child, and sibling relationships among resources
The types of resources and the hierarchy of resources depend on the type of service you are configuring.
The types of cluster resources are listed in Appendix B, HA Resource Parameters. Information about parent, child, and sibling relationships among resources is described in Appendix C, HA Resource Behavior.

2.7. Configuration Validation

The cluster configuration is automatically validated according to the cluster schema at /usr/share/cluster/cluster.rng during startup time and when a configuration is reloaded. Also, you can validate a cluster configuration any time by using the ccs_config_validate command. For information on configuration validation when using the ccs command, see Section 5.1.6, "Configuration Validation".
An annotated schema is available for viewing at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example /usr/share/doc/cman-3.0.12/cluster_conf.html).
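For example, you can trigger validation by hand. The ccs_config_validate command checks the local configuration; the xmllint invocation shown after it validates directly against the RELAX NG schema and is an assumption about equivalent tooling rather than a documented procedure.

    # run as root on a cluster node
    ccs_config_validate
    xmllint --noout --relaxng /usr/share/cluster/cluster.rng /etc/cluster/cluster.conf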
Configuration validation checks for the following basic errors:
  • XML validity - Checks that the configuration file is a valid XML file.
  • Configuration options - Checks to make sure that options (XML elements and attributes) are valid.
  • Option values - Checks that the options contain valid data (limited).
The following examples show a valid configuration and invalid configurations that illustrate the validation checks:

Example 2.3. cluster.conf Sample Configuration: Valid File

<cluster name="mycluster" config_version="1">
   <logging debug="off"/>
   <clusternodes>
     <clusternode name="node-01.example.com" nodeid="1">
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-02.example.com" nodeid="2">
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-03.example.com" nodeid="3">
         <fence>
         </fence>
     </clusternode>
   </clusternodes>
   <fencedevices>
   </fencedevices>
   <rm>
   </rm>
</cluster>

Example 2.4. cluster.conf Sample Configuration: Invalid XML

<cluster name="mycluster" config_version="1">
   <logging debug="off"/>
   <clusternodes>
     <clusternode name="node-01.example.com" nodeid="1">
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-02.example.com" nodeid="2">
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-03.example.com" nodeid="3">
         <fence>
         </fence>
     </clusternode>
   </clusternodes>
   <fencedevices>
   </fencedevices>
   <rm>
   </rm>
<cluster>         <----------------INVALID
In this example, the last line of the configuration (annotated as "INVALID" here) is missing a slash - it is <cluster> instead of </cluster>.

Example 2.5. cluster.conf Sample Configuration: Invalid Option

<cluster name="mycluster" config_version="1">
   <loging debug="off"/>         <----------------INVALID
   <clusternodes>
     <clusternode name="node-01.example.com" nodeid="1">
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-02.example.com" nodeid="2">
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-03.example.com" nodeid="3">
         <fence>
         </fence>
     </clusternode>
   </clusternodes>
   <fencedevices>
   </fencedevices>
   <rm>
   </rm>
</cluster>
In this example, the second line of the configuration (annotated as "INVALID" here) contains an invalid XML element - it is loging instead of logging.

Example 2.6. cluster.conf Sample Configuration: Invalid Option Value

<cluster name="mycluster" config_version="1">
   <logging debug="off"/>
   <clusternodes>
     <clusternode name="node-01.example.com" nodeid="-1">    <--------INVALID
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-02.example.com" nodeid="2">
         <fence>
         </fence>
     </clusternode>
     <clusternode name="node-03.example.com" nodeid="3">
         <fence>
         </fence>
     </clusternode>
   </clusternodes>
   <fencedevices>
   </fencedevices>
   <rm>
   </rm>
</cluster>
In this example, the fourth line of the configuration (annotated as "INVALID" here) contains an invalid value for the XML attribute, nodeid in the clusternode line for node-01.example.com. The value is a negative value ("-1") instead of a positive value ("1"). For the nodeid attribute, the value must be a positive value.

2.8. Considerations for NetworkManager

The use of NetworkManager is not supported on cluster nodes. If you have installed NetworkManager on your cluster nodes, you should either remove it or disable it.

Note

The cman service will not start if NetworkManager is either running or has been configured to run with the chkconfig command.
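A minimal sketch for stopping and disabling NetworkManager on a cluster node follows; run the commands as root.

    service NetworkManager stop
    chkconfig NetworkManager off
    # or remove the package entirely
    yum remove NetworkManager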

2.9. Considerations for Using Quorum Disk

Quorum Disk is a disk-based quorum daemon, qdiskd, that provides supplemental heuristics to determine node fitness. With heuristics you can determine factors that are important to the operation of the node in the event of a network partition. For example, in a four-node cluster with a 3:1 split, ordinarily, the three nodes automatically "win" because of the three-to-one majority. Under those circumstances, the one node is fenced. With qdiskd however, you can set up heuristics that allow the one node to win based on access to a critical resource (for example, a critical network path). If your cluster requires additional methods of determining node health, then you should configure qdiskd to meet those needs.

Note

Configuring qdiskd is not required unless you have special requirements for node health. An example of a special requirement is an "all-but-one" configuration. In an all-but-one configuration, qdiskd is configured to provide enough quorum votes to maintain quorum even though only one node is working.

Important

Overall, heuristics and other qdiskd parameters for your deployment depend on the site environment and special requirements needed. To understand the use of heuristics and other qdiskd parameters, refer to the qdisk(5) man page. If you require assistance understanding and using qdiskd for your site, contact an authorized Red Hat support representative.
If you need to use qdiskd, you should take into account the following considerations:
Cluster node votes
When using Quorum Disk, each cluster node must have one vote.
CMAN membership timeout value
The CMAN membership timeout value (the time a node needs to be unresponsive before CMAN considers that node to be dead, and not a member) should be at least two times that of the qdiskd membership timeout value. This is because the quorum daemon must detect failed nodes on its own, and can take much longer to do so than CMAN. The default value for CMAN membership timeout is 10 seconds. Other site-specific conditions may affect the relationship between the membership timeout values of CMAN and qdiskd. For assistance with adjusting the CMAN membership timeout value, contact an authorized Red Hat support representative.
Fencing
To ensure reliable fencing when using qdiskd, use power fencing. While other types of fencing can be reliable for clusters not configured with qdiskd, they are not reliable for a cluster configured with qdiskd.
Maximum nodes
A cluster configured with qdiskd supports a maximum of 16 nodes. The reason for the limit is scalability; increasing the node count increases the amount of synchronous I/O contention on the shared quorum disk device.
Quorum disk device
A quorum disk device should be a shared block device with concurrent read/write access by all nodes in a cluster. The minimum size of the block device is 10 Megabytes. Examples of shared block devices that can be used by qdiskd are a multi-port SCSI RAID array, a Fibre Channel RAID SAN, or a RAID-configured iSCSI target. You can create a quorum disk device with mkqdisk, the Cluster Quorum Disk Utility. For information about using the utility refer to the mkqdisk(8) man page.
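As a hedged sketch, the following creates and then lists a quorum disk label on a shared device; the device path and label are hypothetical, and the device must be visible to every cluster node.

    # run as root on one node
    mkqdisk -c /dev/mapper/quorum -l myqdisk
    # list quorum disk labels visible to this node
    mkqdisk -L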

Note

Using JBOD as a quorum disk is not recommended. A JBOD cannot provide dependable performance and therefore may not allow a node to write to it quickly enough. If a node is unable to write to a quorum disk device quickly enough, the node is falsely evicted from a cluster.

2.10. Red Hat High Availability Add-On and SELinux

The High Availability Add-On for Red Hat Enterprise Linux 6 supports SELinux in the enforcing state with the SELinux policy type set to targeted.
For more information about SELinux, refer to Deployment Guide for Red Hat Enterprise Linux 6.

2.11. Multicast Addresses

The nodes in a cluster communicate among each other using multicast addresses. Therefore, each network switch and associated networking equipment in the Red Hat High Availability Add-On must be capable of supporting multicast addresses and IGMP (Internet Group Management Protocol), and multicast addressing and IGMP must be enabled on that equipment. Without multicast and IGMP, not all nodes can participate in a cluster, causing the cluster to fail; use UDP unicast in these environments, as described in Section 2.12, "UDP Unicast Traffic".

Note

Procedures for configuring network switches and associated networking equipment vary according to each product. Refer to the appropriate vendor documentation or other information about configuring network switches and associated networking equipment to enable multicast addresses and IGMP.

2.12. UDP Unicast Traffic

As of the Red Hat Enterprise Linux 6.2 release, the nodes in a cluster can communicate with each other using the UDP Unicast transport mechanism. It is recommended, however, that you use IP multicasting for the cluster network. UDP unicast is an alternative that can be used when IP multicasting is not available.
You can configure the Red Hat High-Availability Add-On to use UDP unicast by setting the cman transport="udpu" parameter in the cluster.conf configuration file. You can also specify Unicast from the Network Configuration page of the Conga user interface, as described in Section 3.5.3, "Network Configuration".
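For reference, a minimal sketch of the relevant cluster.conf fragment is shown below; remember to increment config_version when you edit the file by hand.

    <cluster name="mycluster" config_version="2">
       <cman transport="udpu"/>
       <!-- clusternodes, fencedevices, and rm sections unchanged -->
    </cluster>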

2.13. Considerations for ricci

For Red Hat Enterprise Linux 6, ricci replaces ccsd. Therefore, ricci must be running on each cluster node in order to propagate updated cluster configuration, whether via the cman_tool version -r command, the ccs command, or the luci user interface server. You can start ricci by using service ricci start or by enabling it to start at boot time via chkconfig. For information on enabling IP ports for ricci, refer to Section 2.3.1, "Enabling IP Ports on Cluster Nodes".
For the Red Hat Enterprise Linux 6.1 release and later, using ricci requires a password the first time you propagate updated cluster configuration from any particular node. After you install ricci on your system, set the ricci password as root by running the passwd ricci command for user ricci.
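Put together, a minimal sketch of preparing ricci on each cluster node looks like the following; run the commands as root.

    yum install ricci            # if the package is not already installed
    passwd ricci                 # set the password used when configuration is propagated
    service ricci start
    chkconfig ricci on           # start ricci at boot time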

2.14. Configuring Virtual Machines in a Clustered Environment

When you configure your cluster with virtual machine resources, you should use the rgmanager tools to start and stop the virtual machines. Using virsh to start the machine can result in the virtual machine running in more than one place, which can cause data corruption in the virtual machine.
To reduce the chances of administrators accidentally "double-starting" virtual machines by using both cluster and non-cluster tools in a clustered environment, you can configure your system by storing the virtual machine configuration files in a non-default location. Storing the virtual machine configuration files somewhere other than their default location makes it more difficult to accidentally start a virtual machine using virsh, as the configuration file will be unknown out of the box to virsh.
The non-default location for virtual machine configuration files may be anywhere. The advantage of using an NFS share or a shared GFS2 file system is that the administrator does not need to keep the configuration files in sync across the cluster members. However, it is also permissible to use a local directory as long as the administrator keeps the contents synchronized in some way across the cluster.
In the cluster configuration, virtual machines may reference this non-default location by using the path attribute of a virtual machine resource. Note that the path attribute is a directory or set of directories separated by the colon ':' character, not a path to a specific file.
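As a hedged illustration, a virtual machine resource that points at such a non-default configuration directory might look like the following in cluster.conf, with the service then started through the rgmanager tools rather than virsh; the guest name and directories are hypothetical.

    <vm name="guest1" path="/mnt/vm_configs:/mnt/vm_configs_extra" recovery="restart"/>

    # start (enable) the virtual machine service with the rgmanager tools, as root
    clusvcadm -e vm:guest1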

Warning

The libvirt-guests service should be disabled on all the nodes that are running rgmanager. If a virtual machine autostarts or resumes, this can result in the virtual machine running in more than one place, which can cause data corruption in the virtual machine.
For more information on the attributes of a virtual machine resource, refer to Table B.24, "Virtual Machine".