Lesson 8
Online Configuration
VCS 5.1 for UNIX: Install and Configure
Copyright © 2009 Symantec Corporation. All rights reserved.
Online service group configuration
The chart on the left in the diagram illustrates the high-level procedure you can use to modify the cluster configuration while VCS is running.
Online configuration procedure
You can use the procedures shown in the diagram as a standard methodology for creating service groups and resources. Although there are many ways you could
vary this configuration procedure, following a recommended practice simplifies
and streamlines the initial configuration and facilitates troubleshooting if you
encounter configuration problems.
Adding a service group using the GUI
The minimum required information to create a service group is:
• Enter a unique name. Using a consistent naming scheme helps identify the purpose of the service group and all associated resources.
• Specify the list of systems on which the service group can run.
  This is defined in the SystemList attribute for the service group, as displayed in the excerpt from the sample main.cf file. A priority number is associated with each system to determine the order in which systems are selected for failover. The lower-numbered system is selected first.
• The Startup box specifies that the service group starts automatically when VCS starts on the system, if the service group is not already online elsewhere in the cluster. This is defined by the AutoStartList attribute of the service group. In the example displayed in the slide, the S1 system is selected as the system on which DemoSG is started when VCS starts up.
• The Service Group Type selection is Failover by default.
If you save the configuration after creating the service group, you can view the
main.cf file to see the effect of had modifying the configuration and writing the changes to the local disk.
Note: You can click the Show Command button to see the commands that are run when you click OK.
Adding a service group using the CLI
You can also use the VCS command-line interface to modify a running cluster configuration. The next example shows how to use hagrp commands to add the DemoSG service group and modify its attributes.
haconf -makerw
hagrp -add DemoSG
hagrp -modify DemoSG SystemList S1 0 S2 1
hagrp -modify DemoSG AutoStartList S1
haconf -dump -makero
The corresponding main.cf excerpt for DemoSG is shown in the slide.
Notice that the main.cf definition for the DemoSG service group does not include the Parallel attribute. When an attribute is set to its default value, it is not written to the main.cf file. To display all values for all attributes:
• In the GUI, select the object (resource, service group, system, or cluster), click the Properties tab, and click Show all attributes.
• From the command line, use the -display option to the corresponding ha command. For example:
  hagrp -display DemoSG
See the command-line reference card provided with this course for a list of commonly used ha commands.
Adding resources
Online resource configuration procedure
Add resources to a service group in the order of resource dependencies starting from the child resource (bottom up). This enables each resource to be tested as it is added to the service group.
Adding a resource requires you to specify:
• The service group name
• The unique resource name
  If you prefix the resource name with the service group name, you can more easily identify the service group to which it belongs. When you display a list of resources from the command line using the hares -list command, the resources are sorted alphabetically.
• The resource type
• Attribute values
Use the procedure shown in the diagram to configure a resource.
Notes:
• It is recommended that you set each resource to be non-critical during initial configuration. This simplifies testing and troubleshooting in the event that you have specified incorrect configuration information. If a resource faults due to a configuration error, the service group does not fail over if resources are non-critical.
• Enabling a resource signals the agent to start monitoring the resource.
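The procedure and notes above can be sketched as a small shell helper. This is a minimal dry-run sketch under stated assumptions, not the course's own script: it only prints the ha commands in order and runs nothing against VCS. The function name print_add_resource and the resource name DemoNIC are hypothetical, and eri0 is a placeholder Device value.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for adding one resource to a
# running cluster. Nothing is executed against VCS.
# Usage: print_add_resource GROUP RESOURCE TYPE [ATTR VALUE]...
print_add_resource() {
    group=$1 res=$2 type=$3
    shift 3
    echo "haconf -makerw"                    # open the configuration read-write
    echo "hares -add $res $type $group"      # create the resource in the group
    echo "hares -modify $res Critical 0"     # non-critical during initial testing
    while [ $# -ge 2 ]; do
        echo "hares -modify $res $1 $2"      # set each attribute value
        shift 2
    done
    echo "hares -modify $res Enabled 1"      # signal the agent to start monitoring
    echo "haconf -dump -makero"              # save and close the configuration
}

# Hypothetical example: a NIC resource named DemoNIC in DemoSG.
print_add_resource DemoSG DemoNIC NIC Device eri0
```

Printing the commands first makes it easy to review the sequence before typing it on a cluster node.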
Adding a Resource using the GUI: NIC example
The NIC resource has only one required attribute, Device, on all platforms other than HP-UX, which also requires NetworkHosts unless PingOptimize is set to 0. Optional attributes for NIC vary by platform. Refer to the Veritas Cluster Server Bundled Agents Reference Guide for a complete definition. The following optional attributes are common to all platforms:
• NetworkType: Type of network, Ethernet (ether)
• PingOptimize: Number of monitor cycles to detect if the configured interface is inactive
  A value of 1 optimizes broadcast pings and requires two monitor cycles. A value of 0 performs a broadcast ping during each monitor cycle and detects the inactive interface within the cycle. The default is 1.
  Note: On the HP-UX platform, if the PingOptimize attribute is set to 1, the monitor entry point does not send broadcast pings.
• NetworkHosts: The list of hosts on the network that are used to determine if the network connection is alive
  It is recommended that you enter the IP address of the host rather than the host name to prevent the monitor cycle from timing out due to DNS problems.
• Example Device attribute values:
  Solaris: eri0; HP-UX: lan0; Linux: eth0; AIX: en0
Persistent resources
If you add a persistent resource as the first resource of a new service group, as
shown in the lab exercise for this lesson, notice that the service group status is
offline, even though the resource status is online.
Persistent resources are not considered when VCS reports service group status,
because they are always online. When a nonpersistent resource is added to the
group, such as IP, the service group status reflects the status of that nonpersistent resource.
Adding an IP resource
The slide shows the required attribute values for an IP resource (on Solaris) in the DemoSG service group. The corresponding entry is made in the main.cf file when the configuration is saved.
Notice that the IP resource on Solaris has two required attributes: Device and Address, which specify the network interface and IP address, respectively. The required attributes vary depending on the platform.
Optional Attributes
• NetMask: Netmask associated with the application IP address
  – The value may be specified in decimal (base 10) or hexadecimal (base 16). The default is the netmask corresponding to the IP address class.
  – This is a required attribute on AIX.
• Options: Options to be used with the ifconfig command
• ArpDelay: Number of seconds to sleep between configuring an interface and sending out a broadcast to inform routers about this IP address
  The default is 1 second.
• IfconfigTwice: If set to 1, this attribute causes an IP address to be configured twice, using an ifconfig up-down-up sequence. This behavior increases the probability of gratuitous ARPs (caused by ifconfig up) reaching clients.
  The default is 0.
Adding a resource using the CLI: DiskGroup example
You can use the hares command to add a resource and configure the required
attributes. This example shows how to add a DiskGroup resource.
The DiskGroup resource
The DiskGroup resource has only one required attribute, DiskGroup, except on
Linux, which also requires StartVolumes and StopVolumes.
Notes:
• In versions prior to 4.0, VCS uses vxdg with the -t option when importing a disk group to disable autoimport. This ensures that VCS controls the disk group. VCS deports a disk group if it was manually imported without the -t option (outside of VCS control).
• In versions 4.1 and 5.0, VCS sets the vxdg autoimport option to no, which disables autoimporting of disk groups.
Example optional attributes:
• StartVolumes: Starts all volumes after importing the disk group
  This also starts layered volumes by running vxrecover -s. The default is 1, enabled, on all UNIX platforms except Linux.
• StopVolumes: Stops all volumes with vxvol before deporting the disk group
  The default is 1, enabled, on all UNIX platforms except Linux.
The Volume resource
The Volume resource can be used to manage a VxVM volume. Although the Volume resource is not strictly required, it provides additional monitoring. You can use a DiskGroup resource to start volumes when the DiskGroup resource is brought online. This has the effect of starting volumes more quickly, but only the disk group is monitored.
However, if you have a large number of volumes on a single disk group, the DiskGroup resource can time out when trying to start or stop all the volumes simultaneously. In this case, you can set the StartVolumes and StopVolumes attributes of the DiskGroup resource to 0, and create Volume resources to start the volumes individually.
Also, if you are using volumes as raw devices with no file systems, and, therefore, no Mount resources, consider using Volume resources for the additional level of monitoring.
The Volume resource has no optional attributes.
The Mount resource
The Mount resource has the required attributes displayed in the main.cf file
excerpt in the slide.
Example optional attributes:
• MountOpt: Specifies options for the mount command
  When setting attributes with arguments starting with a dash (-), use the percent (%) character to escape the arguments. Example:
  hares -modify DemoMount FsckOpt %-y
  The percent character is an escape character for the VCS CLI, which prevents VCS from interpreting the string as an argument to hares.
• SnapUmount: Determines whether VxFS snapshots are unmounted when the file system is taken offline (unmounted)
  The default is 0, meaning that snapshots are not automatically unmounted when the file system is unmounted.
  Note: If SnapUmount is set to 0 and a VxFS snapshot of the file system is mounted, the unmount operation fails when the resource is taken offline, and the service group is not able to fail over.
  This is desired behavior in some situations, such as when a backup is being performed from the snapshot.
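The percent-escape rule described for attribute values that start with a dash can be wrapped in a small helper. This is a minimal sketch, assuming a POSIX shell: the function name vcs_escape_value is hypothetical and not part of VCS, and the final line prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: prefix a value with % when it starts with a dash, so that
# hares does not treat it as one of its own options.
vcs_escape_value() {
    case $1 in
        -*) printf '%%%s\n' "$1" ;;   # "-y" becomes "%-y"
        *)  printf '%s\n' "$1" ;;     # other values pass through unchanged
    esac
}

# Print (not run) the FsckOpt command from the text above.
echo "hares -modify DemoMount FsckOpt $(vcs_escape_value -y)"
```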
File system locking
Storage Foundation 5.0 MP3 introduced a new feature to enable a file system to be mounted with a key which must be used to unmount the file system. The Mount resource has a VxFSMountLock attribute to manage the file system mount key. This attribute is set to the “VCS” string by default when a Mount resource is added. The Mount agent uses this key for online and offline operations to ensure the file system cannot be inadvertently unmounted outside of VCS control.
You can unlock a file system without unmounting by using the fsadm command:
/opt/VRTS/bin/fsadm -o mntunlock="key" mount_point_name
The Process resource
The Process resource controls the application and is added last because it requires all other resources to be online in order to start. The Process resource is used to
start, stop, and monitor the status of a process.
• Online: Starts the process specified in the PathName attribute, with options, if specified in the Arguments attribute
• Offline: Sends SIGTERM to the process
  SIGKILL is sent if the process does not exit within one second.
• Monitor: Determines if the process is running by scanning the process table
The optional Arguments attribute specifies any command-line options to use when starting the process.
Process attribute specification
• If the executable is a shell script, you must specify the script name followed by arguments. You must also specify the full path for the shell in the PathName attribute.
• The monitor script calls ps and matches the process name. The process name field is limited to 80 characters in the ps output. If you specify a path name to a process that is longer than 80 characters, the monitor entry point fails.
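The 80-character limit described above can be checked before a value is assigned to the PathName attribute. This is a minimal sketch, assuming a POSIX shell: the function name check_pathname_length and the sample path are hypothetical, and the check only counts characters in the path itself.

```shell
#!/bin/sh
# Sketch: warn when a candidate PathName value exceeds the 80-character
# limit that the text above gives for the process name field in ps output.
MAX_PROC_NAME=80

check_pathname_length() {
    len=${#1}
    if [ "$len" -gt "$MAX_PROC_NAME" ]; then
        echo "too long ($len characters): $1"
        return 1
    fi
    echo "ok ($len characters): $1"
}

# Hypothetical path, used only for illustration.
check_pathname_length /opt/demo/bin/demod
```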
Solving common configuration errors
Troubleshooting resources
Verify that each resource is online on the local system before continuing the
service group configuration procedure.
If you are unable to bring a resource online, use the procedure in the diagram to
find and fix the problem. You can view the logs through Cluster Manager or in the /var/VRTSvcs/logs directory if you need to determine the cause of errors.
VCS log entries are written to engine_A.log and agent entries are written to
resource_A.log files.
Note: Some resources do not need to be disabled and reenabled. Only resources whose agents have open and close entry points, such as MultiNICB, require
you to disable and enable again after fixing the problem. By contrast, a
Mount resource does not need to be disabled if, for example, you incorrectly
specify the MountPoint attribute.
However, it is generally good practice to disable and enable regardless because it is difficult to remember when it is required and when it is not. In addition, a resource is immediately monitored upon enabling, which would indicate potential problems with attribute specification.
More detail on performing tasks necessary for solving resource configuration
problems is provided in the following sections.
Flushing a service group
Occasionally, agents for the resources in a service group can appear to become suspended waiting for resources to be brought online or be taken offline. Generally, this condition occurs during initial configuration and testing because the required attributes for a resource are not defined properly or the underlying operating system resources are not prepared correctly. If it appears that a resource or group has become suspended while being brought online, you can flush the service group to enable corrective action.
Flushing a service group stops VCS from attempting to bring resources online and clears any internal wait states. You can then check resources for configuration problems or underlying operating system configuration problems, and then attempt to bring resources back online.
Note: Before flushing a service group, verify that the physical or software resource is actually stopped.
Disabling and enabling a resource
Disable a resource before you start modifying attributes to fix a misconfigured
resource. When you disable a resource:
• VCS stops monitoring the resource, so it does not fault or wait to come online while you are making changes.
• The agent calls the close entry point, if defined. The close entry point is optional.
• When the close tasks are completed, or if there is no close entry point, the agent stops monitoring the resource.
When you enable a resource, VCS calls the agent to immediately monitor the resource and then continues to periodically direct the agent to monitor the resource.
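The disable, modify, and enable sequence can be sketched as a dry-run helper. This is a sketch under stated assumptions rather than a prescribed procedure: it only prints the commands, the function name print_fix_attribute is hypothetical, and the resource name DemoMount and the /demo mount point are placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for correcting one attribute of a
# misconfigured resource. Nothing is executed against VCS.
# Usage: print_fix_attribute RESOURCE ATTRIBUTE NEW_VALUE
print_fix_attribute() {
    res=$1 attr=$2 value=$3
    echo "haconf -makerw"                     # open the configuration read-write
    echo "hares -modify $res Enabled 0"       # disable: monitoring stops
    echo "hares -modify $res $attr $value"    # correct the attribute
    echo "hares -modify $res Enabled 1"       # enable: resource is monitored immediately
    echo "haconf -dump -makero"               # save and close the configuration
}

# Hypothetical example: fix the mount point of a resource named DemoMount.
print_fix_attribute DemoMount MountPoint /demo
```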
Clearing resource faults
A fault indicates that the monitor entry point is reporting an unexpected offline state for a previously online resource. This indicates a problem with the underlying component being managed by the resource.
Before clearing a fault, you must resolve the problem that caused the fault. Use the VCS logs to help you determine which resource has faulted and why.
It is important to clear faults for critical resources after fixing underlying problems so that the system where the fault originally occurred can be a failover target for the service group. In a two-node cluster, a faulted critical resource would prevent the service group from failing back if another fault occurred. You can clear a faulted resource on a particular system, or on all systems where the service group can run.
Note: Faulted persistent resources should be probed to force the agent to monitor the resource immediately. Otherwise, the resource is not reported as online until the next OfflineMonitorInterval, up to five minutes.
Clearing and Probing Resources Using the CLI
• To clear a faulted resource, type:
  hares -clear resource [-sys system]
  If the system name is not specified, the resource is cleared on all systems.
• To probe a resource, type:
  hares -probe resource -sys system
Testing the service group
After you have successfully brought each resource online, link the resources and switch the service group to each system on which the service group can run.
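Linking and switching can be sketched as a dry-run script. This is an illustrative sketch only: it prints commands and runs nothing against VCS, the function names are hypothetical, and DemoIP, DemoNIC, and DemoProcess are assumed resource names with assumed dependencies. On a real cluster, the link commands would be run with the configuration open read-write.

```shell
#!/bin/sh
# Dry-run sketch: print commands that link resources and then switch
# the service group to each system in turn. Nothing is executed.

# Print one "hares -link PARENT CHILD" line per dependency pair.
print_links() {
    while [ $# -ge 2 ]; do
        echo "hares -link $1 $2"     # the parent depends on the child
        shift 2
    done
}

# Print one "hagrp -switch GROUP -to SYSTEM" line per target system.
print_switch_test() {
    group=$1
    shift
    for sys in "$@"; do
        echo "hagrp -switch $group -to $sys"
    done
}

# Hypothetical resource names; S1 and S2 come from DemoSG's SystemList.
print_links DemoIP DemoNIC DemoProcess DemoIP
print_switch_test DemoSG S2 S1
```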
Test procedure
For simplicity, the example service group uses the default Priority failover policy. That is, if a critical resource in DemoSG faults, the service group is taken offline and brought online on the system with the lowest priority value that is available for failover.
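The selection rule of the Priority policy can be modeled in a few lines of shell. This is a simplified sketch of the rule as stated above, not how VCS implements it internally: the function name pick_failover_target is hypothetical, and the list of unavailable systems is supplied by hand.

```shell
#!/bin/sh
# Sketch: choose a failover target from SystemList-style pairs
# ("name priority name priority ..."), skipping unavailable systems.
# Usage: pick_failover_target "UNAVAILABLE SYSTEMS" NAME PRIO [NAME PRIO]...
pick_failover_target() {
    unavailable=" $1 "
    shift
    best_sys=
    best_prio=
    while [ $# -ge 2 ]; do
        case $unavailable in
            *" $1 "*) ;;                       # skip systems that are not available
            *)
                if [ -z "$best_sys" ] || [ "$2" -lt "$best_prio" ]; then
                    best_sys=$1                # lowest priority value wins
                    best_prio=$2
                fi
                ;;
        esac
        shift 2
    done
    if [ -n "$best_sys" ]; then
        echo "$best_sys"
    else
        return 1                               # no system is available
    fi
}

# DemoSG's SystemList is "S1 0 S2 1"; assume the fault occurred on S1.
pick_failover_target "S1" S1 0 S2 1    # prints S2
```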
The “Handling Resource Faults” lesson provides additional information about configuring and testing failover behavior. Additional failover policies are also described in the Veritas Cluster Server for UNIX: Cluster Management
course.