This document is obsolete; see http://doc.nethence.com/ instead.
MC/SG service package configuration
MC/Serviceguard on RHEL5 : http://pbraun.nethence.com/doc/sysutils/mcsg.html
Configure a package
Create a package folder,
mkdir package1
chmod 700 package1
Generate the package configuration and script templates,
cmmakepkg -p package1/package1.conf.dist
cmmakepkg -s package1/package1.sh.dist
Wipe out the comments from the configuration,
cd package1
sed -e '
/^#/d;
/^$/d;
/^[[:space:]]*#/d;
' package1.conf.dist > package1.conf
cp package1.sh.dist package1.sh
chmod 700 package1.conf
chmod 700 package1.sh
cd ..
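The stripping step above can be wrapped in a small shell function if you create several packages. This is only a sketch: the name "strip_comments" is hypothetical, and unlike the sed command above it also drops lines that contain only whitespace.

```shell
# strip_comments SRC DST  (hypothetical helper, not part of Serviceguard)
# Copies SRC to DST, dropping comment lines and blank lines.
strip_comments() {
    sed -e '/^[[:space:]]*#/d' \
        -e '/^[[:space:]]*$/d' "$1" > "$2"
}
```

For example, "strip_comments package1.conf.dist package1.conf" from within the package folder.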
Edit the package configuration,
vi package1/package1.conf
like,
PACKAGE_NAME package1
PACKAGE_TYPE FAILOVER
NODE_NAME sg1
NODE_NAME sg2
AUTO_RUN YES
NODE_FAIL_FAST_ENABLED NO
RUN_SCRIPT /usr/local/cmcluster/conf/package1/package1.sh
HALT_SCRIPT /usr/local/cmcluster/conf/package1/package1.sh
RUN_SCRIPT_TIMEOUT 800
HALT_SCRIPT_TIMEOUT 800
SUCCESSOR_HALT_TIMEOUT 800
FAILOVER_POLICY CONFIGURED_NODE
FAILBACK_POLICY MANUAL
PRIORITY NO_PRIORITY
SERVICE_NAME lalaservice
SERVICE_FAIL_FAST_ENABLED no
SERVICE_HALT_TIMEOUT 300
Note. "NODE_NAME" may have "*" for all nodes
Note. It is a good idea to always specify both RUN_SCRIPT_TIMEOUT and HALT_SCRIPT_TIMEOUT.
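A quick way to confirm that both timeouts are present in a stripped configuration file is sketched below. The function names "has_param" and "check_timeouts" are hypothetical helpers, and the check only looks for the parameter names at the start of a line; it does not validate the values.

```shell
# has_param CONF KEY: succeeds if KEY appears as a parameter name in CONF.
has_param() {
    grep -q "^[[:space:]]*$2[[:space:]]" "$1"
}

# check_timeouts CONF: prints each missing timeout parameter and
# returns non-zero if at least one is missing.
check_timeouts() {
    rc=0
    for key in RUN_SCRIPT_TIMEOUT HALT_SCRIPT_TIMEOUT; do
        if ! has_param "$1" "$key"; then
            echo "missing: $key"
            rc=1
        fi
    done
    return $rc
}
```

For example, "check_timeouts package1/package1.conf" prints nothing when both parameters are set.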
Edit the package script,
vi package1/package1.sh
configure a virtual IP and a service,
# IP ADDRESSES
IP[0]="10.1.1.20"
SUBNET[0]="10.1.1.0"
# SERVICE NAMES AND COMMANDS.
SERVICE_NAME[0]="lalaservice"
SERVICE_CMD[0]="/usr/bin/xload -display 10.1.1.9 -label `uname -n`:lala"
SERVICE_RESTART[0]="-R"
Note. "10.1.1.9" here is an X-capable workstation.
Note. "-R" for unlimited restarts, "-r 3" for 3 restarts. See "man cmrunserv".
Note. "xload" lets you see the hostname of the node it's running on, "xclock" doesn't.
Note. Edit "CUSTOMER DEFINED FUNCTIONS" only if you need pre- and post-service execution scripts.
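The service name appears both in the package configuration (SERVICE_NAME) and in the package script (SERVICE_NAME[0]), so it is worth checking that the two agree before applying. A minimal sketch follows; the three function names are hypothetical, and it assumes the script assignments start at the beginning of the line, as in the template.

```shell
# conf_services CONF: service names declared in the package configuration.
conf_services() {
    awk '$1 == "SERVICE_NAME" { print $2 }' "$1" | sort
}

# script_services SCRIPT: service names assigned to SERVICE_NAME[n]
# in the package script, with surrounding double quotes removed.
script_services() {
    awk -F= '/^SERVICE_NAME\[[0-9]+\]=/ { gsub(/"/, "", $2); print $2 }' "$1" | sort
}

# services_match CONF SCRIPT: succeeds if both files list the same names.
services_match() {
    [ "$(conf_services "$1")" = "$(script_services "$2")" ]
}
```

For example, "services_match package1/package1.conf package1/package1.sh || echo mismatch".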
Copy the package to the other nodes, verify and apply the package config,
scp -rp package1 sg2:/root/conf
cmcheckconf -P package1/package1.conf
cmapplyconf -P package1/package1.conf
Note. "-v" for more details
Note. Sometimes, when reconfiguring the package, you need to delete it first,
#cmdeleteconf -p package1
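The verify and apply steps can be chained so that nothing is applied when verification fails. This is a sketch only: "check_and_apply" is a hypothetical name, and it assumes cmcheckconf returns a non-zero exit status on error.

```shell
# check_and_apply CONF  (hypothetical helper)
# Runs cmapplyconf on CONF only if cmcheckconf accepts it.
# Assumption: cmcheckconf exits non-zero when verification fails.
check_and_apply() {
    if cmcheckconf -v -P "$1"; then
        cmapplyconf -v -P "$1"
    else
        echo "cmcheckconf failed for $1, not applying" >&2
        return 1
    fi
}
```

For example, "check_and_apply package1/package1.conf".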
Autostart the package,
cmmodpkg -e package1
cmviewcl
Note. "cmmodpkg -d" to disable autostart/failover
Note. to start the package manually on e.g. "sg2",
cmrunpkg -n sg2 package1
If you experience any errors, see the logs,
tail /var/log/messages
tail /root/conf/package1/package1.sh.log
Switch from sg1 to sg2,
cmhaltpkg package1 && cmrunpkg -n sg2 package1
cmmodpkg -e package1
cmviewcl
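The switch sequence above can be kept in one function that stops at the first failing step. A minimal sketch, with the hypothetical name "switch_pkg"; unlike the commands above, it chains every step with "&&", so autostart is not re-enabled if the package failed to start on the target node.

```shell
# switch_pkg PACKAGE NODE  (hypothetical helper)
# Halts PACKAGE, starts it on NODE, re-enables autostart/failover,
# then shows the cluster state. Stops at the first command that fails.
switch_pkg() {
    cmhaltpkg "$1" &&
    cmrunpkg -n "$2" "$1" &&
    cmmodpkg -e "$1" &&
    cmviewcl
}
```

For example, "switch_pkg package1 sg2".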
Update failover configuration
- update and copy the package's failover configuration (NODE_NAME). Order matters: the first NODE_NAME entry defines the default node.
- check the failover configuration,
grep NODE_NAME pkg1/pkg1.conf
grep NODE_NAME pkg2/pkg2.conf
- apply,
cmapplyconf -P pkg1/pkg1.conf
cmapplyconf -P pkg2/pkg2.conf
- verify,
cmgetconf -p pkg1 | grep ^NODE_NAME
cmgetconf -p pkg2 | grep ^NODE_NAME
cmgetconf -p pkg1 | grep ^SERVICE_NAME
cmgetconf -p pkg2 | grep ^SERVICE_NAME
cmviewcl -v | less
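Since the NODE_NAME order decides the default node, it helps to print the entries numbered. The sketch below uses a hypothetical function name, "node_order"; it reads a configuration file given as argument, or standard input when called without one.

```shell
# node_order [CONF]  (hypothetical helper)
# Prints the NODE_NAME entries in order and marks the first as default.
node_order() {
    awk '$1 == "NODE_NAME" {
        n++
        printf "%d %s%s\n", n, $2, (n == 1 ? " (default)" : "")
    }' "$@"
}
```

For example, "node_order pkg1/pkg1.conf" before applying, and "cmgetconf -p pkg1 | node_order" afterwards.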
Add a new package
- create the package's configuration and script
- define the NODE_NAME order and SERVICE_NAME
- fix the package script,
IP[0]=10.1.1.23
SERVICE_NAME[0]="loloservice"
- check the package's configuration and script,
grep NODE_NAME pkg3/pkg3.conf
grep SERVICE_NAME pkg3/pkg3.conf
grep ^IP pkg3/pkg3.sh
grep ^SERVICE_NAME pkg3/pkg3.sh
- apply,
cmapplyconf -P pkg3/pkg3.conf
- verify,
cmgetconf -p pkg3 | grep ^NODE_NAME
cmgetconf -p pkg3 | grep ^SERVICE_NAME
cmviewcl -v | less
If you have any issues running the new package, watch the logs,
tail /var/log/messages
tail pkg3/pkg3.sh.log
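The four grep checks used when adding a package can be grouped in one function. This is a sketch with a hypothetical name, "pkg_summary"; it assumes the layout used here, i.e. a folder NAME containing NAME.conf and NAME.sh.

```shell
# pkg_summary NAME  (hypothetical helper)
# Shows the node and service settings of NAME/NAME.conf and the
# IP and service settings of NAME/NAME.sh.
pkg_summary() {
    conf="$1/$1.conf"
    script="$1/$1.sh"
    echo "== $conf"
    grep -E '^(NODE_NAME|SERVICE_NAME)[[:space:]]' "$conf"
    echo "== $script"
    grep -E '^(IP|SERVICE_NAME)\[' "$script"
}
```

For example, "pkg_summary pkg3" from the folder that contains pkg3.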
HA scenarios
Here's what happens in different scenarios:
- If the application crashes or if you close it, it is restarted according to "-R" or e.g. "-r 3" (see above).
- If you shut down a node that owns a package, the package is switched to another node immediately.
- If you disconnect all STATIONARY and HEARTBEAT interfaces, the cluster tries to re-form for a while (2 or 3 minutes, probably corresponding to NODE_TIMEOUT) and finally starts the package on another node.
- If you disconnect only the STATIONARY interface, nothing happens. The service is off the network and the cluster doesn't notice it.