Cluster Administration

Chapter 8. Managing Red Hat High Availability Add-On With Command Line Tools

This chapter describes various administrative tasks for managing Red Hat High Availability Add-On and consists of the following sections:

Important

Make sure that your deployment of Red Hat High Availability Add-On meets your needs and can be supported. Consult with an authorized Red Hat representative to verify your configuration prior to deployment. In addition, allow time for a configuration burn-in period to test failure modes.

Important

This chapter references commonly used cluster.conf elements and attributes. For a comprehensive list and description of cluster.conf elements and attributes, refer to the cluster schema at /usr/share/cluster/cluster.rng, and the annotated schema at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example /usr/share/doc/cman-3.0.12/cluster_conf.html).

Important

Certain procedures in this chapter call for using the cman_tool version -r command to propagate a cluster configuration throughout a cluster. Using that command requires that ricci be running.
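For example, before propagating an updated configuration you might confirm that ricci is running on the node and then push the update. The following is a minimal sketch, assuming ricci is managed by its standard init script:

    [root@example-01 ~]# service ricci status
    [root@example-01 ~]# cman_tool version -r

If ricci is not running, start it with service ricci start before running cman_tool version -r.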

Note

Procedures in this chapter may include specific commands for some of the command-line tools listed in Appendix E, Command Line Tools Summary. For more information about all commands and variables, refer to the man page for each command-line tool.

8.1. Starting and Stopping the Cluster Software

You can start or stop cluster software on a node according to Section 8.1.1, "Starting Cluster Software" and Section 8.1.2, "Stopping Cluster Software". Starting cluster software on a node causes it to join the cluster; stopping the cluster software on a node causes it to leave the cluster.

8.1.1. Starting Cluster Software

To start the cluster software on a node, type the following commands in this order:
  1. service cman start
  2. service clvmd start, if CLVM has been used to create clustered volumes
  3. service gfs2 start, if you are using Red Hat GFS2
  4. service rgmanager start, if you are using high-availability (HA) services (rgmanager).
For example:
[root@example-01 ~]# service cman start
Starting cluster:
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]
[root@example-01 ~]# service clvmd start
Starting clvmd:                                            [  OK  ]
Activating VG(s):   2 logical volume(s) in volume group "vg_example" now active
                                                           [  OK  ]
[root@example-01 ~]# service gfs2 start
Mounting GFS2 filesystem (/mnt/gfsA):                      [  OK  ]
Mounting GFS2 filesystem (/mnt/gfsB):                      [  OK  ]
[root@example-01 ~]# service rgmanager start
Starting Cluster Service Manager:                          [  OK  ]
[root@example-01 ~]#
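To confirm that the node has joined the cluster once these services have started, one option (used in later procedures in this chapter as well) is to run cman_tool nodes and check for an "M" in the status column, "Sts". A brief sketch, reusing this chapter's example node names:

[root@example-01 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    548   2010-09-28 10:52:21  node-01.example.com
   2   M    548   2010-09-28 10:52:21  node-02.example.com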

8.1.2. Stopping Cluster Software

To stop the cluster software on a node, type the following commands in this order:
  1. service rgmanager stop, if you are using high-availability (HA) services (rgmanager).
  2. service gfs2 stop, if you are using Red Hat GFS2
  3. umount -at gfs2, if you are using Red Hat GFS2 in conjunction with rgmanager, to ensure that any GFS2 file systems mounted during rgmanager startup (but not unmounted during shutdown) are also unmounted.
  4. service clvmd stop, if CLVM has been used to create clustered volumes
  5. service cman stop
For example:
[root@example-01 ~]# service rgmanager stop
Stopping Cluster Service Manager:                          [  OK  ]
[root@example-01 ~]# service gfs2 stop
Unmounting GFS2 filesystem (/mnt/gfsA):                    [  OK  ]
Unmounting GFS2 filesystem (/mnt/gfsB):                    [  OK  ]
[root@example-01 ~]# umount -at gfs2
[root@example-01 ~]# service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]
[root@example-01 ~]# service cman stop
Stopping cluster:
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Waiting for corosync to shutdown:                       [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]
[root@example-01 ~]#

Note

Stopping cluster software on a node causes its HA services to fail over to another node. As an alternative to that, consider relocating or migrating HA services to another node before stopping cluster software. For information about managing HA services, refer to Section 8.3, "Managing High-Availability Services".
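For example, to relocate the example_apache service used in this chapter's examples to node-02.example.com before stopping the cluster software on its current owner, a hedged sketch using the clusvcadm syntax described in Table 8.2, "Service Operations":

[root@example-01 ~]# clusvcadm -r example_apache -m node-02.example.com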

8.2. Deleting or Adding a Node

This section describes how to delete a node from a cluster and add a node to a cluster. You can delete a node from a cluster according to Section 8.2.1, "Deleting a Node from a Cluster"; you can add a node to a cluster according to Section 8.2.2, "Adding a Node to a Cluster".

8.2.1. Deleting a Node from a Cluster

Deleting a node from a cluster consists of shutting down the cluster software on the node to be deleted and updating the cluster configuration to reflect the change.

Important

If deleting a node from the cluster causes a transition from greater than two nodes to two nodes, you must restart the cluster software at each node after updating the cluster configuration file.
To delete a node from a cluster, perform the following steps:
  1. At any node, use the clusvcadm utility to relocate, migrate, or stop each HA service running on the node that is being deleted from the cluster. For information about using clusvcadm, refer to Section 8.3, "Managing High-Availability Services".
  2. At the node to be deleted from the cluster, stop the cluster software according to Section 8.1.2, "Stopping Cluster Software". For example:
    [root@example-01 ~]# service rgmanager stop
    Stopping Cluster Service Manager:                          [  OK  ]
    [root@example-01 ~]# service gfs2 stop
    Unmounting GFS2 filesystem (/mnt/gfsA):                    [  OK  ]
    Unmounting GFS2 filesystem (/mnt/gfsB):                    [  OK  ]
    [root@example-01 ~]# service clvmd stop
    Signaling clvmd to exit                                    [  OK  ]
    clvmd terminated                                           [  OK  ]
    [root@example-01 ~]# service cman stop
    Stopping cluster:
       Leaving fence domain...                                 [  OK  ]
       Stopping gfs_controld...                                [  OK  ]
       Stopping dlm_controld...                                [  OK  ]
       Stopping fenced...                                      [  OK  ]
       Stopping cman...                                        [  OK  ]
       Waiting for corosync to shutdown:                       [  OK  ]
       Unloading kernel modules...                             [  OK  ]
       Unmounting configfs...                                  [  OK  ]
    [root@example-01 ~]#
  3. At any node in the cluster, edit the /etc/cluster/cluster.conf file to remove the clusternode section of the node that is to be deleted. For example, in Example 8.1, "Three-node Cluster Configuration", if node-03.example.com is to be removed, then delete the clusternode section for that node. If removing a node (or nodes) causes the cluster to become a two-node cluster, you can add the following line to the configuration file to allow a single node to maintain quorum (for example, if one node fails):
    <cman two_node="1" expected_votes="1"/>
    Refer to Section 8.2.3, "Examples of Three-Node and Two-Node Configurations" for comparison between a three-node and a two-node configuration.
  4. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  5. Save /etc/cluster/cluster.conf.
  6. (Optional) Validate the updated file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  7. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
  8. Verify that the updated configuration file has been propagated (one way to check this is shown in the example after this procedure).
  9. If the node count of the cluster has transitioned from greater than two nodes to two nodes, you must restart the cluster software as follows:
    1. At each node, stop the cluster software according to Section 8.1.2, "Stopping Cluster Software". For example:
      [root@example-01 ~]# service rgmanager stop
      Stopping Cluster Service Manager:                          [  OK  ]
      [root@example-01 ~]# service gfs2 stop
      Unmounting GFS2 filesystem (/mnt/gfsA):                    [  OK  ]
      Unmounting GFS2 filesystem (/mnt/gfsB):                    [  OK  ]
      [root@example-01 ~]# service clvmd stop
      Signaling clvmd to exit                                    [  OK  ]
      clvmd terminated                                           [  OK  ]
      [root@example-01 ~]# service cman stop
      Stopping cluster:
         Leaving fence domain...                                 [  OK  ]
         Stopping gfs_controld...                                [  OK  ]
         Stopping dlm_controld...                                [  OK  ]
         Stopping fenced...                                      [  OK  ]
         Stopping cman...                                        [  OK  ]
         Waiting for corosync to shutdown:                       [  OK  ]
         Unloading kernel modules...                             [  OK  ]
         Unmounting configfs...                                  [  OK  ]
      [root@example-01 ~]#
    2. At each node, start the cluster software according to Section 8.1.1, "Starting Cluster Software". For example:
      [root@example-01 ~]# service cman start
      Starting cluster:
         Checking Network Manager...                             [  OK  ]
         Global setup...                                         [  OK  ]
         Loading kernel modules...                               [  OK  ]
         Mounting configfs...                                    [  OK  ]
         Starting cman...                                        [  OK  ]
         Waiting for quorum...                                   [  OK  ]
         Starting fenced...                                      [  OK  ]
         Starting dlm_controld...                                [  OK  ]
         Starting gfs_controld...                                [  OK  ]
         Unfencing self...                                       [  OK  ]
         Joining fence domain...                                 [  OK  ]
      [root@example-01 ~]# service clvmd start
      Starting clvmd:                                            [  OK  ]
      Activating VG(s):   2 logical volume(s) in volume group "vg_example" now active
                                                                 [  OK  ]
      [root@example-01 ~]# service gfs2 start
      Mounting GFS2 filesystem (/mnt/gfsA):                      [  OK  ]
      Mounting GFS2 filesystem (/mnt/gfsB):                      [  OK  ]
      [root@example-01 ~]# service rgmanager start
      Starting Cluster Service Manager:                          [  OK  ]
      [root@example-01 ~]#
    3. At any cluster node, run cman_tool nodes to verify that the nodes are functioning as members in the cluster (signified as "M" in the status column, "Sts"). For example:
      [root@example-01 ~]# cman_tool nodes
      Node  Sts   Inc   Joined               Name
         1   M    548   2010-09-28 10:52:21  node-01.example.com
         2   M    548   2010-09-28 10:52:21  node-02.example.com
    4. At any node, using the clustat utility, verify that the HA services are running as expected. In addition, clustat displays status of the cluster nodes. For example:
      [root@example-01 ~]# clustat
      Cluster Status for mycluster @ Wed Nov 17 05:40:00 2010
      Member Status: Quorate

       Member Name                        ID   Status
       ------ ----                        ---- ------
       node-02.example.com                   2 Online, rgmanager
       node-01.example.com                   1 Online, Local, rgmanager

       Service Name                 Owner (Last)                 State
       ------- ----                 ----- ------                 -----
       service:example_apache       node-01.example.com          started
       service:example_apache2      (none)                       disabled
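As noted in step 8, one way to verify that the updated configuration has been propagated is to compare the configuration version each node reports with the config_version value you set in /etc/cluster/cluster.conf. A hedged sketch, filtering the cman_tool status output shown elsewhere in this chapter (the version number below is an assumed value):

[root@example-01 ~]# cman_tool status | grep "Config Version"
Config Version: 4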

8.2.2. Adding a Node to a Cluster

Adding a node to a cluster consists of updating the cluster configuration, propagating the updated configuration to the node to be added, and starting the cluster software on that node. To add a node to a cluster, perform the following steps:
  1. At any node in the cluster, edit the /etc/cluster/cluster.conf file to add a clusternode section for the node that is to be added. For example, in Example 8.2, "Two-node Cluster Configuration", if node-03.example.com is to be added, then add a clusternode section for that node. If adding a node (or nodes) causes the cluster to transition from a two-node cluster to a cluster with three or more nodes, remove the following cman attributes from /etc/cluster/cluster.conf:
    • cman two_node="1"
    • expected_votes="1"
    Refer to Section 8.2.3, "Examples of Three-Node and Two-Node Configurations" for comparison between a three-node and a two-node configuration.
  2. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  3. Save /etc/cluster/cluster.conf.
  4. (Optional) Validate the updated file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  5. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
  6. Verify that the updated configuration file has been propagated.
  7. Propagate the updated configuration file to /etc/cluster/ in each node to be added to the cluster. For example, use the scp command to send the updated configuration file to each node to be added to the cluster.
  8. If the node count of the cluster has transitioned from two nodes to greater than two nodes, you must restart the cluster software in the existing cluster nodes as follows:
    1. At each node, stop the cluster software according to Section 8.1.2, "Stopping Cluster Software". For example:
      [root@example-01 ~]# service rgmanager stop
      Stopping Cluster Service Manager:                          [  OK  ]
      [root@example-01 ~]# service gfs2 stop
      Unmounting GFS2 filesystem (/mnt/gfsA):                    [  OK  ]
      Unmounting GFS2 filesystem (/mnt/gfsB):                    [  OK  ]
      [root@example-01 ~]# service clvmd stop
      Signaling clvmd to exit                                    [  OK  ]
      clvmd terminated                                           [  OK  ]
      [root@example-01 ~]# service cman stop
      Stopping cluster:
         Leaving fence domain...                                 [  OK  ]
         Stopping gfs_controld...                                [  OK  ]
         Stopping dlm_controld...                                [  OK  ]
         Stopping fenced...                                      [  OK  ]
         Stopping cman...                                        [  OK  ]
         Waiting for corosync to shutdown:                       [  OK  ]
         Unloading kernel modules...                             [  OK  ]
         Unmounting configfs...                                  [  OK  ]
      [root@example-01 ~]#
    2. At each node, start the cluster software according to Section 8.1.1, "Starting Cluster Software". For example:
      [root@example-01 ~]# service cman start
      Starting cluster:
         Checking Network Manager...                             [  OK  ]
         Global setup...                                         [  OK  ]
         Loading kernel modules...                               [  OK  ]
         Mounting configfs...                                    [  OK  ]
         Starting cman...                                        [  OK  ]
         Waiting for quorum...                                   [  OK  ]
         Starting fenced...                                      [  OK  ]
         Starting dlm_controld...                                [  OK  ]
         Starting gfs_controld...                                [  OK  ]
         Unfencing self...                                       [  OK  ]
         Joining fence domain...                                 [  OK  ]
      [root@example-01 ~]# service clvmd start
      Starting clvmd:                                            [  OK  ]
      Activating VG(s):   2 logical volume(s) in volume group "vg_example" now active
                                                                 [  OK  ]
      [root@example-01 ~]# service gfs2 start
      Mounting GFS2 filesystem (/mnt/gfsA):                      [  OK  ]
      Mounting GFS2 filesystem (/mnt/gfsB):                      [  OK  ]
      [root@example-01 ~]# service rgmanager start
      Starting Cluster Service Manager:                          [  OK  ]
      [root@example-01 ~]#
  9. At each node to be added to the cluster, start the cluster software according to Section 8.1.1, "Starting Cluster Software". For example:
    [root@example-01 ~]# service cman start
    Starting cluster:
       Checking Network Manager...                             [  OK  ]
       Global setup...                                         [  OK  ]
       Loading kernel modules...                               [  OK  ]
       Mounting configfs...                                    [  OK  ]
       Starting cman...                                        [  OK  ]
       Waiting for quorum...                                   [  OK  ]
       Starting fenced...                                      [  OK  ]
       Starting dlm_controld...                                [  OK  ]
       Starting gfs_controld...                                [  OK  ]
       Unfencing self...                                       [  OK  ]
       Joining fence domain...                                 [  OK  ]
    [root@example-01 ~]# service clvmd start
    Starting clvmd:                                            [  OK  ]
    Activating VG(s):   2 logical volume(s) in volume group "vg_example" now active
                                                               [  OK  ]
    [root@example-01 ~]# service gfs2 start
    Mounting GFS2 filesystem (/mnt/gfsA):                      [  OK  ]
    Mounting GFS2 filesystem (/mnt/gfsB):                      [  OK  ]
    [root@example-01 ~]# service rgmanager start
    Starting Cluster Service Manager:                          [  OK  ]
    [root@example-01 ~]#
  10. At any node, using the clustat utility, verify that each added node is running and part of the cluster. For example:
    [root@example-01 ~]# clustat
    Cluster Status for mycluster @ Wed Nov 17 05:40:00 2010
    Member Status: Quorate

     Member Name                        ID   Status
     ------ ----                        ---- ------
     node-03.example.com                   3 Online, rgmanager
     node-02.example.com                   2 Online, rgmanager
     node-01.example.com                   1 Online, Local, rgmanager

     Service Name                 Owner (Last)                 State
     ------- ----                 ----- ------                 -----
     service:example_apache       node-01.example.com          started
     service:example_apache2      (none)                       disabled
    For information about using clustat, refer to Section 8.3, "Managing High-Availability Services".
    In addition, you can use cman_tool status to verify node votes, node count, and quorum count. For example:
    [root@example-01 ~]# cman_tool status
    Version: 6.2.0
    Config Version: 19
    Cluster Name: mycluster
    Cluster Id: 3794
    Cluster Member: Yes
    Cluster Generation: 548
    Membership state: Cluster-Member
    Nodes: 3
    Expected votes: 3
    Total votes: 3
    Node votes: 1
    Quorum: 2
    Active subsystems: 9
    Flags:
    Ports Bound: 0 11 177
    Node name: node-01.example.com
    Node ID: 3
    Multicast addresses: 239.192.14.224
    Node addresses: 10.15.90.58
  11. At any node, you can use the clusvcadm utility to migrate or relocate a running service to the newly joined node. Also, you can enable any disabled services. For information about using clusvcadm, refer to Section 8.3, "Managing High-Availability Services".
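For example, to enable the disabled service example_apache2 on the newly added node and to relocate example_apache to it, a hedged sketch using names from this chapter's examples and the clusvcadm syntax in Table 8.2, "Service Operations":

[root@example-01 ~]# clusvcadm -e example_apache2 -m node-03.example.com
[root@example-01 ~]# clusvcadm -r example_apache -m node-03.example.com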

8.2.3. Examples of Three-Node and Two-Node Configurations

Refer to the examples that follow for comparison between a three-node and a two-node configuration.

Example 8.1. Three-node Cluster Configuration

<cluster name="mycluster" config_version="3">   <cman/>   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC">  <device name="apc" port="1"/> </method> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC">  <device name="apc" port="2"/> </method> </fence> </clusternode> <clusternode name="node-03.example.com" nodeid="3"> <fence> <method name="APC">  <device name="apc" port="3"/> </method> </fence> </clusternode>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>   </fencedevices>   <rm>   <failoverdomains>   <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">   <failoverdomainnode name="node-01.example.com" priority="1"/>   <failoverdomainnode name="node-02.example.com" priority="2"/>   <failoverdomainnode name="node-03.example.com" priority="3"/>   </failoverdomain>   </failoverdomains>   <resources>   <fs name="web_fs" device="/dev/sdd2" mountpoint="/var/www" fstype="ext3"/>   <ip address="127.143.131.100" monitor_link="yes" sleeptime="10"/>   <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/>   </resources>   <service autostart="0" domain="example_pri" exclusive="0" name="example_apache" recovery="relocate">   <fs ref="web_fs"/>   <ip ref="127.143.131.100"/>   <apache ref="example_server"/>   </service>   <service autostart="0" domain="example_pri" exclusive="0" name="example_apache2" recovery="relocate">   <fs name="web_fs2" device="/dev/sdd3" mountpoint="/var/www" fstype="ext3"/>   <ip address="127.143.131.101" monitor_link="yes" sleeptime="10"/>   <apache config_file="conf/httpd.conf" name="example_server2" server_root="/etc/httpd" shutdown_wait="0"/>   </service>   </rm></cluster>

Example 8.2. Two-node Cluster Configuration

<cluster name="mycluster" config_version="3">   <cman two_node="1" expected_votes="1"/>   <clusternodes> <clusternode name="node-01.example.com" nodeid="1"> <fence> <method name="APC">  <device name="apc" port="1"/> </method> </fence> </clusternode> <clusternode name="node-02.example.com" nodeid="2"> <fence> <method name="APC">  <device name="apc" port="2"/> </method> </fence>   </clusternodes>   <fencedevices> <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>   </fencedevices>   <rm>   <failoverdomains>   <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">   <failoverdomainnode name="node-01.example.com" priority="1"/>   <failoverdomainnode name="node-02.example.com" priority="2"/>   </failoverdomain>   </failoverdomains>   <resources>   <fs name="web_fs" device="/dev/sdd2" mountpoint="/var/www" fstype="ext3"/>   <ip address="127.143.131.100" monitor_link="yes" sleeptime="10"/>   <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/>   </resources>   <service autostart="0" domain="example_pri" exclusive="0" name="example_apache" recovery="relocate">   <fs ref="web_fs"/>   <ip ref="127.143.131.100"/>   <apache ref="example_server"/>   </service>   <service autostart="0" domain="example_pri" exclusive="0" name="example_apache2" recovery="relocate">   <fs name="web_fs2" device="/dev/sdd3" mountpoint="/var/www" fstype="ext3"/>   <ip address="127.143.131.101" monitor_link="yes" sleeptime="10"/>   <apache config_file="conf/httpd.conf" name="example_server2" server_root="/etc/httpd" shutdown_wait="0"/>   </service>   </rm></cluster>

8.3. Managing High-Availability Services

You can manage high-availability services using the Cluster Status Utility, clustat, and the Cluster User Service Administration Utility, clusvcadm. clustat displays the status of a cluster and clusvcadm provides the means to manage high-availability services.
This section provides basic information about managing HA services using the clustat and clusvcadm commands. It consists of the following subsections:

8.3.1. Displaying HA Service Status with clustat

clustat displays cluster-wide status. It shows membership information, the quorum view, and the state of all high-availability services, and it indicates on which node the clustat command is being run (Local). Table 8.1, "Services Status" describes the states that services can be in, as displayed when running clustat. Example 8.3, "clustat Display" shows an example of a clustat display. For more detailed information about running the clustat command, refer to the clustat man page.

Table 8.1. Services Status

Started
    The service resources are configured and available on the cluster system that owns the service.
Recovering
    The service is pending start on another node.
Disabled
    The service has been disabled, and does not have an assigned owner. A disabled service is never restarted automatically by the cluster.
Stopped
    In the stopped state, the service will be evaluated for starting after the next service or node transition. This is a temporary state. You may disable or enable the service from this state.
Failed
    The service is presumed dead. A service is placed into this state whenever a resource's stop operation fails. After a service is placed into this state, you must verify that there are no resources allocated (mounted file systems, for example) before issuing a disable request; disable is the only operation that can take place when a service has entered this state. (See the example after this table.)
Uninitialized
    This state can appear in certain cases during startup and when running clustat -f.
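For example, before disabling a service that is in the failed state, you might confirm that none of its file systems remain mounted. The following is a hedged sketch using the example_apache service and the /var/www mount point from this chapter's configuration examples:

[root@example-01 ~]# mount | grep /var/www
[root@example-01 ~]# clusvcadm -d example_apache

If the mount command prints any output, unmount the listed file systems before issuing the disable request.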

Example 8.3. clustat Display

[root@example-01 ~]# clustat
Cluster Status for mycluster @ Wed Nov 17 05:40:15 2010
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 node-03.example.com                   3 Online, rgmanager
 node-02.example.com                   2 Online, rgmanager
 node-01.example.com                   1 Online, Local, rgmanager

 Service Name                 Owner (Last)                 State
 ------- ----                 ----- ------                 -----
 service:example_apache       node-01.example.com          started
 service:example_apache2      (none)                       disabled

8.3.2. Managing HA Services with clusvcadm

You can manage HA services using the clusvcadm command. With it you can perform the following operations:
  • Enable and start a service.
  • Disable a service.
  • Stop a service.
  • Freeze a service.
  • Unfreeze a service.
  • Migrate a service (for virtual machine services only).
  • Relocate a service.
  • Restart a service.
Table 8.2, "Service Operations" describes the operations in more detail. For a complete description on how to perform those operations, refer to the clusvcadm utility man page.

Table 8.2. Service Operations

Enable
    Start the service, optionally on a preferred target and optionally according to failover domain rules. In the absence of either a preferred target or failover domain rules, the local host where clusvcadm is run starts the service. If the original start fails, the service behaves as though a relocate operation was requested (refer to Relocate in this table). If the operation succeeds, the service is placed in the started state.
    Command syntax: clusvcadm -e <service_name> or clusvcadm -e <service_name> -m <member> (using the -m option specifies the preferred target member on which to start the service).
Disable
    Stop the service and place it in the disabled state. This is the only permissible operation when a service is in the failed state.
    Command syntax: clusvcadm -d <service_name>
Relocate
    Move the service to another node. Optionally, you may specify a preferred node to receive the service, but the inability of the service to run on that host (for example, if the service fails to start or the host is offline) does not prevent relocation, and another node is chosen. rgmanager attempts to start the service on every permissible node in the cluster. If no permissible target node in the cluster successfully starts the service, the relocation fails and rgmanager attempts to restart the service on the original owner. If the original owner cannot restart the service, the service is placed in the stopped state.
    Command syntax: clusvcadm -r <service_name> or clusvcadm -r <service_name> -m <member> (using the -m option specifies the preferred target member on which to start the service).
Stop
    Stop the service and place it in the stopped state.
    Command syntax: clusvcadm -s <service_name>
Freeze
    Freeze a service on the node where it is currently running. This prevents status checks of the service as well as failover in the event the node fails or rgmanager is stopped. This can be used to suspend a service to allow maintenance of underlying resources. Refer to the section called "Considerations for Using the Freeze and Unfreeze Operations" for important information about using the freeze and unfreeze operations.
    Command syntax: clusvcadm -Z <service_name>
Unfreeze
    Unfreeze takes a service out of the freeze state. This re-enables status checks. Refer to the section called "Considerations for Using the Freeze and Unfreeze Operations" for important information about using the freeze and unfreeze operations.
    Command syntax: clusvcadm -U <service_name>
Migrate
    Migrate a virtual machine to another node. You must specify a target node. Depending on the failure, a failure to migrate may leave the virtual machine in the failed state or in the started state on the original owner. (See the sketch after this table.)
    Command syntax: clusvcadm -M <service_name> -m <member>

Important

For the migrate operation, you must specify a target node using the -m <member> option.
Restart
    Restart a service on the node where it is currently running.
    Command syntax: clusvcadm -R <service_name>
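For example, to migrate a virtual machine service to node-02.example.com, a hedged sketch in which my_guest is a hypothetical virtual machine service name (it is not part of this chapter's configuration examples); note that the target node must be specified with the -m option:

[root@example-01 ~]# clusvcadm -M my_guest -m node-02.example.com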

Considerations for Using the Freeze and Unfreeze Operations

Using the freeze operation allows maintenance of parts of rgmanager services. For example, if you have a database and a web server in one rgmanager service, you may freeze the rgmanager service, stop the database, perform maintenance, restart the database, and unfreeze the service.
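For example, a hedged sketch of that maintenance workflow using the example_apache service from this chapter's examples; the maintenance steps themselves depend on your resources and are represented only by a placeholder line:

[root@example-01 ~]# clusvcadm -Z example_apache
(perform maintenance on the underlying resources, for example stop and restart the database)
[root@example-01 ~]# clusvcadm -U example_apache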
When a service is frozen, it behaves as follows:
  • Status checks are disabled.
  • Start operations are disabled.
  • Stop operations are disabled.
  • Failover will not occur (even if you power off the service owner).

Important

Failure to follow these guidelines may result in resources being allocated on multiple hosts:
  • You must not stop all instances of rgmanager when a service is frozen unless you plan to reboot the hosts prior to restarting rgmanager.
  • You must not unfreeze a service until the reported owner of the service rejoins the cluster and restarts rgmanager.

8.4. Updating a Configuration

Updating the cluster configuration consists of editing the cluster configuration file (/etc/cluster/cluster.conf) and propagating it to each node in the cluster. You can update the configuration using either of the following procedures:

8.4.1. Updating a Configuration Using cman_tool version -r

To update the configuration using the cman_tool version -r command, perform the following steps:
  1. At any node in the cluster, edit the /etc/cluster/cluster.conf file.
  2. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  3. Save /etc/cluster/cluster.conf.
  4. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes. It is necessary that ricci be running in each cluster node to be able to propagate updated cluster configuration information.
  5. Verify that the updated configuration file has been propagated.
  6. You may skip this step (restarting cluster software) if you have made only the following configuration changes:
    • Deleting a node from the cluster configuration, except where the node count changes from greater than two nodes to two nodes. For information about deleting a node from a cluster and transitioning from greater than two nodes to two nodes, refer to Section 8.2, "Deleting or Adding a Node".
    • Adding a node to the cluster configuration, except where the node count changes from two nodes to greater than two nodes. For information about adding a node to a cluster and transitioning from two nodes to greater than two nodes, refer to Section 8.2.2, "Adding a Node to a Cluster".
    • Changes to how daemons log information.
    • HA service/VM maintenance (adding, editing, or deleting).
    • Resource maintenance (adding, editing, or deleting).
    • Failover domain maintenance (adding, editing, or deleting).
    Otherwise, you must restart the cluster software as follows:
    1. At each node, stop the cluster software according to Section 8.1.2, "Stopping Cluster Software". For example:
      [root@example-01 ~]# service rgmanager stop
      Stopping Cluster Service Manager:                          [  OK  ]
      [root@example-01 ~]# service gfs2 stop
      Unmounting GFS2 filesystem (/mnt/gfsA):                    [  OK  ]
      Unmounting GFS2 filesystem (/mnt/gfsB):                    [  OK  ]
      [root@example-01 ~]# service clvmd stop
      Signaling clvmd to exit                                    [  OK  ]
      clvmd terminated                                           [  OK  ]
      [root@example-01 ~]# service cman stop
      Stopping cluster:
         Leaving fence domain...                                 [  OK  ]
         Stopping gfs_controld...                                [  OK  ]
         Stopping dlm_controld...                                [  OK  ]
         Stopping fenced...                                      [  OK  ]
         Stopping cman...                                        [  OK  ]
         Waiting for corosync to shutdown:                       [  OK  ]
         Unloading kernel modules...                             [  OK  ]
         Unmounting configfs...                                  [  OK  ]
      [root@example-01 ~]#
    2. At each node, start the cluster software according to Section 8.1.1, "Starting Cluster Software". For example:
      [root@example-01 ~]# service cman start
      Starting cluster:
         Checking Network Manager...                             [  OK  ]
         Global setup...                                         [  OK  ]
         Loading kernel modules...                               [  OK  ]
         Mounting configfs...                                    [  OK  ]
         Starting cman...                                        [  OK  ]
         Waiting for quorum...                                   [  OK  ]
         Starting fenced...                                      [  OK  ]
         Starting dlm_controld...                                [  OK  ]
         Starting gfs_controld...                                [  OK  ]
         Unfencing self...                                       [  OK  ]
         Joining fence domain...                                 [  OK  ]
      [root@example-01 ~]# service clvmd start
      Starting clvmd:                                            [  OK  ]
      Activating VG(s):   2 logical volume(s) in volume group "vg_example" now active
                                                                 [  OK  ]
      [root@example-01 ~]# service gfs2 start
      Mounting GFS2 filesystem (/mnt/gfsA):                      [  OK  ]
      Mounting GFS2 filesystem (/mnt/gfsB):                      [  OK  ]
      [root@example-01 ~]# service rgmanager start
      Starting Cluster Service Manager:                          [  OK  ]
      [root@example-01 ~]#
      Stopping and starting the cluster software ensures that any configuration changes that are checked only at startup time are included in the running configuration.
  7. At any cluster node, run cman_tool nodes to verify that the nodes are functioning as members in the cluster (signified as "M" in the status column, "Sts"). For example:
    [root@example-01 ~]# cman_tool nodes
    Node  Sts   Inc   Joined               Name
       1   M    548   2010-09-28 10:52:21  node-01.example.com
       2   M    548   2010-09-28 10:52:21  node-02.example.com
       3   M    544   2010-09-28 10:52:21  node-03.example.com
  8. At any node, using the clustat utility, verify that the HA services are running as expected. In addition, clustat displays status of the cluster nodes. For example:
    [root@example-01 ~]# clustat
    Cluster Status for mycluster @ Wed Nov 17 05:40:00 2010
    Member Status: Quorate

     Member Name                        ID   Status
     ------ ----                        ---- ------
     node-03.example.com                   3 Online, rgmanager
     node-02.example.com                   2 Online, rgmanager
     node-01.example.com                   1 Online, Local, rgmanager

     Service Name                 Owner (Last)                 State
     ------- ----                 ----- ------                 -----
     service:example_apache       node-01.example.com          started
     service:example_apache2      (none)                       disabled
  9. If the cluster is running as expected, you are done updating the configuration.

8.4.2. Updating a Configuration Using scp

To update the configuration using the scp command, perform the following steps:
  1. At each node, stop the cluster software according to Section 8.1.2, "Stopping Cluster Software". For example:
    [root@example-01 ~]# service rgmanager stop
    Stopping Cluster Service Manager:                          [  OK  ]
    [root@example-01 ~]# service gfs2 stop
    Unmounting GFS2 filesystem (/mnt/gfsA):                    [  OK  ]
    Unmounting GFS2 filesystem (/mnt/gfsB):                    [  OK  ]
    [root@example-01 ~]# service clvmd stop
    Signaling clvmd to exit                                    [  OK  ]
    clvmd terminated                                           [  OK  ]
    [root@example-01 ~]# service cman stop
    Stopping cluster:
       Leaving fence domain...                                 [  OK  ]
       Stopping gfs_controld...                                [  OK  ]
       Stopping dlm_controld...                                [  OK  ]
       Stopping fenced...                                      [  OK  ]
       Stopping cman...                                        [  OK  ]
       Waiting for corosync to shutdown:                       [  OK  ]
       Unloading kernel modules...                             [  OK  ]
       Unmounting configfs...                                  [  OK  ]
    [root@example-01 ~]#
  2. At any node in the cluster, edit the /etc/cluster/cluster.conf file.
  3. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
  4. Save /etc/cluster/cluster.conf.
  5. Validate the updated file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate
    Configuration validates
  6. If the updated file is valid, use the scp command to propagate it to /etc/cluster/ in each cluster node (see the sketch after this procedure).
  7. Verify that the updated configuration file has been propagated.
  8. At each node, start the cluster software according to Section 8.1.1, "Starting Cluster Software". For example:
    [root@example-01 ~]# service cman start
    Starting cluster:
       Checking Network Manager...                             [  OK  ]
       Global setup...                                         [  OK  ]
       Loading kernel modules...                               [  OK  ]
       Mounting configfs...                                    [  OK  ]
       Starting cman...                                        [  OK  ]
       Waiting for quorum...                                   [  OK  ]
       Starting fenced...                                      [  OK  ]
       Starting dlm_controld...                                [  OK  ]
       Starting gfs_controld...                                [  OK  ]
       Unfencing self...                                       [  OK  ]
       Joining fence domain...                                 [  OK  ]
    [root@example-01 ~]# service clvmd start
    Starting clvmd:                                            [  OK  ]
    Activating VG(s):   2 logical volume(s) in volume group "vg_example" now active
                                                               [  OK  ]
    [root@example-01 ~]# service gfs2 start
    Mounting GFS2 filesystem (/mnt/gfsA):                      [  OK  ]
    Mounting GFS2 filesystem (/mnt/gfsB):                      [  OK  ]
    [root@example-01 ~]# service rgmanager start
    Starting Cluster Service Manager:                          [  OK  ]
    [root@example-01 ~]#
  9. At any cluster node, run cman_tool nodes to verify that the nodes are functioning as members in the cluster (signified as "M" in the status column, "Sts"). For example:
    [root@example-01 ~]# cman_tool nodes
    Node  Sts   Inc   Joined               Name
       1   M    548   2010-09-28 10:52:21  node-01.example.com
       2   M    548   2010-09-28 10:52:21  node-02.example.com
       3   M    544   2010-09-28 10:52:21  node-03.example.com
  10. At any node, using the clustat utility, verify that the HA services are running as expected. In addition, clustat displays status of the cluster nodes. For example:
    [root@example-01 ~]# clustat
    Cluster Status for mycluster @ Wed Nov 17 05:40:00 2010
    Member Status: Quorate

     Member Name                        ID   Status
     ------ ----                        ---- ------
     node-03.example.com                   3 Online, rgmanager
     node-02.example.com                   2 Online, rgmanager
     node-01.example.com                   1 Online, Local, rgmanager

     Service Name                 Owner (Last)                 State
     ------- ----                 ----- ------                 -----
     service:example_apache       node-01.example.com          started
     service:example_apache2      (none)                       disabled
  11. If the cluster is running as expected, you are done updating the configuration.
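As referenced in step 6, the following is a hedged sketch of propagating the validated file with scp, assuming root SSH access to the other example nodes:

    [root@example-01 ~]# scp /etc/cluster/cluster.conf root@node-02.example.com:/etc/cluster/
    [root@example-01 ~]# scp /etc/cluster/cluster.conf root@node-03.example.com:/etc/cluster/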