Linux High Availability quick setup

Consider two Linux machines:

Active machine heartbeat IP: 192.168.1.1
Passive machine heartbeat IP: 192.168.1.2

Virtual/Floating IP: 192.168.2.11

Install the heartbeat software on both the active and passive machines

manish@active-mc:~$ sudo aptitude install heartbeat
manish@passive-mc:~$ sudo aptitude install heartbeat

Get the node names of both machines


manish@active-mc:~$ uname -n
active-mc
manish@passive-mc:~$ uname -n
passive-mc

On the active host, do the following

Configure the ha.cf file


root@active-mc:/etc/ha.d# cat /etc/ha.d/ha.cf
# path of HA log file
logfile /var/log/ha-log
auto_failback on
# heartbeat signals port
udpport 694
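node active-mc
# heartbeat interface and heartbeat IP of active-mc
ucast eth1 192.168.1.1
node passive-mc
# heartbeat interface and heartbeat IP of passive-mc
# (a ucast line pointing at the local machine is ignored, so the same file works on both nodes)
ucast eth1 192.168.1.2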

Configure the resources to be managed by HA in the haresources file


root@active-mc:/etc/ha.d# cat /etc/ha.d/haresources
#preferred_machine virtual_IP service1 service2...
# the machine name must match the output of uname -n
active-mc 192.168.2.11 syslog-ng

Remember that heartbeat looks for the service startup scripts in the following directories and calls them with the “start” argument

/etc/ha.d/resource.d
/etc/init.d

If you want to pass custom arguments to a startup script, write the entry in the following format


active-mc 192.168.2.11 myservice::myarg1
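
As a rough illustration of the myservice::myarg1 form, a custom script placed in /etc/ha.d/resource.d might look like the sketch below. The script name myservice, the argument myarg1 and the echoed messages are hypothetical. The sketch assumes heartbeat passes the configured argument first and the action (start or stop) last.

```shell
#!/bin/sh
# Hypothetical /etc/ha.d/resource.d/myservice
# Assumed invocation by heartbeat: myservice myarg1 start|stop

myservice() {
    arg="$1"      # value written after "::" in haresources
    action="$2"   # action supplied by heartbeat
    case "$action" in
        start) echo "starting myservice with $arg" ;;
        stop)  echo "stopping myservice with $arg" ;;
        *)     echo "usage: myservice <arg> {start|stop}" >&2
               return 1 ;;
    esac
}

# Dispatch only when arguments were given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    myservice "$@"
fi
```

Replace the echo lines with the real commands that start and stop your service.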

Add an authentication key to the authkeys file

root@active-mc:/etc/ha.d# cat /etc/ha.d/authkeys
auth 1
1 sha1 a617248d47bce143ee169d17ef6298d9

Use a random, hard-to-guess key. You can use a simple tool like md5sum to generate one

tail -n 1000 /var/log/messages | md5sum
a617248d47bce143ee169d17ef6298d9
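
A log tail is not very random. As a hedged alternative, the sketch below derives the key from /dev/urandom and writes a file in authkeys format. The output path ./authkeys.sample is an assumption, chosen so that nothing under /etc/ha.d is overwritten. The mode 600 step is there because heartbeat rejects an authkeys file that other users can read.

```shell
#!/bin/sh
# Sketch: build an authkeys file with a random key.
# AUTHKEYS is an assumed scratch path; copy the result to /etc/ha.d/authkeys.
AUTHKEYS="${AUTHKEYS:-./authkeys.sample}"

# 32 hex characters derived from 512 random bytes
key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -d' ' -f1)

umask 077                       # create the file private from the start
printf 'auth 1\n1 sha1 %s\n' "$key" > "$AUTHKEYS"
chmod 600 "$AUTHKEYS"           # readable and writable by the owner only
```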

Copy the ha.cf, haresources and authkeys files to the /etc/ha.d/ directory on the passive machine
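
One possible way to do the copy is with scp, sketched below. It assumes root SSH access from the active machine to passive-mc. The PEER and RUN variables and the copy_ha_config function are illustrative names, not part of heartbeat. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: copy the three heartbeat files to the passive node.
# Assumes root SSH access to passive-mc; RUN=echo only prints the commands.
PEER="${PEER:-passive-mc}"
RUN="${RUN-echo}"    # run with RUN= (empty) to really copy

copy_ha_config() {
    for f in ha.cf haresources authkeys; do
        # -p preserves the mode bits, so authkeys keeps its permissions
        $RUN scp -p "/etc/ha.d/$f" "root@$PEER:/etc/ha.d/$f" || return 1
    done
}

copy_ha_config
```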

Start the heartbeat service on both machines

/etc/init.d/heartbeat start

Monitor the /var/log/ha-log file on both machines.

To verify that heartbeat is working, check that the virtual IP is assigned to a network interface on the primary host. If you see it assigned on both the active and passive machines, heartbeat is not functioning properly. To troubleshoot the problem, analyse the traffic on the heartbeat network interface configured in ha.cf on both nodes.

tcpdump -i eth1 -vv -n port 694
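
The virtual IP check described above can be sketched as a small script to run on each node. It assumes the ip tool from iproute2 is installed; the has_vip helper is a hypothetical name. In a healthy setup only one node should report the address as assigned.

```shell
#!/bin/sh
# Sketch: report whether the floating IP is configured on this node.
VIP="${VIP:-192.168.2.11}"

# Reads "ip addr" style output on stdin; succeeds if the VIP appears in it.
has_vip() {
    grep -qF "inet $VIP/"
}

if ip -4 addr show 2>/dev/null | has_vip; then
    echo "$(uname -n): $VIP is assigned here"
else
    echo "$(uname -n): $VIP is not assigned here"
fi
```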

Also verify that the services configured in haresources are running on the primary machine.