Install and configure High Availability Linux Cluster with Pacemaker on CentOS 7.6
Introduction to Linux Cluster with Pacemaker
A Linux cluster with Pacemaker is one of the most common clusters we can set up on Linux servers. Pacemaker is available for both RPM-based and Debian-based operating systems.
Pacemaker is a high-availability cluster resource manager. It runs on all the hosts we intend to use in the cluster, keeping our services up and running to reduce downtime. Pacemaker supports the following node redundancy configurations: Active/Active, Active/Passive, N+1, N+M, N-to-1 and N-to-N. The maximum number of nodes supported in a cluster is 16.
As with any cluster, the underlying operating system distribution and version should be the same on all the nodes; we are going to use CentOS 7.6 in our setup. The hardware specifications should match as well.
Perform a minimal OS installation to start setting up the cluster. Follow the guide below to complete the minimal installation on all the nodes planned to be part of the cluster.
Cluster setup is very sensitive to time drift and needs proper time synchronization. While following the above guide for the minimal OS installation, make sure to set the correct date/time, a static IP with the network configuration, and the disk layout.
In our setup, we will use the hostnames and IP information below for all the nodes in our cluster.
In this first guide, we will use only two nodes; later, we will add the remaining two nodes to the cluster as additional nodes.
All the steps below need to be carried out on each node, except “Configure CoroSync”, which needs to be carried out only on node1.
If you prefer to skip the network and NTP setup and do them post-installation, skip that part of the graphical demonstration; below you will find the commands to configure the hostname, interface and NTP. But make sure never to skip the disk partitioning.
We need to configure a static IP to keep the cluster stable by eliminating IP assignment from DHCP servers, because DHCP's periodic address renewal will interfere with corosync. The network configuration must be completed before the date/time can be synchronized.
To download the packages from the Internet, make sure the gateway is reachable.
Type the hostname in the designated area and click Apply to make the changes.
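If you skipped this step in the installer, the hostname can also be set from the CLI with hostnamectl. A minimal sketch for node1 is shown below; the FQDN follows the linuxsysadmins.local domain seen in our logs, so adjust it per node.

```shell
# Set a static hostname on this node (use corcls2, corcls3... on the other nodes)
hostnamectl set-hostname corcls1.linuxsysadmins.local

# Verify the change took effect
hostnamectl status
```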
Configure Timezone and NTP server
Choose the timezone where your server resides.
To configure the NTP server, click on the Gear icon and add the timeserver or use the default existing ones.
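The same can be done from the CLI. This is a sketch assuming the chronyd time service shipped with CentOS 7; the timezone matches the Asia/Dubai zone shown later in the timedatectl output, so substitute your own.

```shell
# Set the timezone to match the server's location
timedatectl set-timezone Asia/Dubai

# Enable NTP time synchronization (handled by chronyd on CentOS 7)
timedatectl set-ntp true
```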
The working servers can be identified by their status shown in green.
Partitioning the Disk
The partitions for /, /home and swap should be kept small in size. The remaining space can be left in the volume group for future use.
Select the filesystem type as XFS and device type as LVM.
Click on Modify under Volume Group. In the window that appears, choose the size policy “As large as possible” to leave the remaining space in the volume group.
The remaining steps are the same as installing a minimal operating system.
Set the System Locale
If your setup is a minimal installation, you need to set the C-type locale to en_US.utf8.
# localectl set-locale LC_CTYPE=en_US.utf8
Print the status to verify the change.
[root@corcls1 ~]# localectl status
System Locale: LC_CTYPE=en_US.utf8
VC Keymap: us
X11 Layout: us
Assigning Static IP Address from CLI
If you skipped the network settings during OS installation, we can configure the same later by running the nmcli command to set a static IP.
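A minimal nmcli sketch is shown below. The connection name ens33 and the addresses are assumptions for node1 (corcls2 was 192.168.107.201 in our setup), so adjust them to your environment.

```shell
# ens33 is an assumed connection name; list yours with `nmcli connection show`.
# The address and gateway below are illustrative for the 192.168.107.0/24 network.
nmcli connection modify ens33 ipv4.method manual \
    ipv4.addresses 192.168.107.200/24 \
    ipv4.gateway 192.168.107.1 \
    ipv4.dns 192.168.107.1

# Re-activate the connection to apply the new settings
nmcli connection up ens33
```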
[root@corcls1 ~]# timedatectl status
Local time: Sat 2019-08-03 17:02:49 +04
Universal time: Sat 2019-08-03 13:02:49 UTC
RTC time: Sat 2019-08-03 13:02:49
Time zone: Asia/Dubai (+04, +0400)
NTP enabled: yes
NTP synchronized: yes
RTC in local TZ: no
DST active: n/a
Verify the sync status. Use -v to get more informative output.
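Assuming chronyd is the time service in use (the CentOS 7 default), the sync status can be checked as below.

```shell
# Show the overall time and NTP sync state
timedatectl status

# Show per-source sync details; -v adds a legend explaining each column
chronyc sources -v
```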
To perform admin tasks, run privileged commands or copy files between nodes, set up passwordless SSH authentication by generating an SSH key.
[root@corcls1 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
The key's randomart image is:
|+Eo .***.o… |
|oo+o .o+ * + |
|o…. .B .o . |
|+.. +oB. |
|+. ooSoo |
|o *+o |
| . .o+. |
| .o |
Copy the generated SSH key to all the nodes.
# ssh-copy-id root@corcls2
[root@corcls1 ~]# ssh-copy-id root@corcls2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'corcls2 (192.168.107.201)' can't be established.
ECDSA key fingerprint is SHA256:Q6D+CZ+PH9PEmUIJwOkJeWBz91z273zwXEBPjk81mX0.
ECDSA key fingerprint is MD5:a3:35:63:21:01:ae:df:3e:6d:b3:6b:79:d9:0d:ff:a8.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@corcls2'"
and check to make sure that only the key(s) you wanted were added.
Verify the passwordless authentication by logging into all the nodes.
[root@corcls1 ~]# ssh root@corcls2
Last login: Fri Aug 2 12:14:21 2019 from 192.168.107.1
[root@corcls2 ~]# exit
Connection to corcls2 closed.
Allow Cluster services through Firewall
Open the required ports by enabling the High-Availability firewalld service.
The predefined firewalld service definition below shows which ports it opens.
[root@corcls1 ~]# cat /usr/lib/firewalld/services/high-availability.xml
<?xml version="1.0" encoding="utf-8"?>
<service>
  <short>Red Hat High Availability</short>
  <description>This allows you to use the Red Hat High Availability (previously named Red Hat Cluster Suite). Ports are opened for corosync, pcsd, pacemaker_remote, dlm and corosync-qnetd.</description>
  <port protocol="tcp" port="2224"/>
  <port protocol="tcp" port="3121"/>
  <port protocol="tcp" port="5403"/>
  <port protocol="udp" port="5404"/>
  <port protocol="udp" port="5405"/>
  <port protocol="tcp" port="9929"/>
  <port protocol="udp" port="9929"/>
  <port protocol="tcp" port="21064"/>
</service>
Running the command below enables all these ports. Reload the firewalld service to apply the changes.
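Enabling the predefined high-availability service opens every port listed above in one step.

```shell
# Permanently allow all cluster ports via the predefined firewalld service
firewall-cmd --permanent --add-service=high-availability

# Reload firewalld so the permanent change takes effect
firewall-cmd --reload
```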
Verify the quorum and voting status with the command below.
# pcs status quorum
[root@corcls1 ~]# pcs status quorum
Date: Fri Aug 2 13:53:31 2019
Quorum provider: corosync_votequorum
Node ID: 1
Ring ID: 1/8
Expected votes: 2
Highest expected: 2
Total votes: 2
Flags: 2Node Quorate WaitForAll
Nodeid Votes Qdevice Name
1 1 NR corcls1 (local)
2 1 NR corcls2
Check the status of CoroSync
Corosync is the cluster engine that provides services such as membership, messaging and quorum.
# pcs status corosync
[root@corcls1 ~]# pcs status corosync
Nodeid Votes Name
1 1 corcls1 (local)
2 1 corcls2
Verify the CoroSync & CIB Configuration
It’s good to know the corosync and CIB configuration files.
The CIB, or Cluster Information Base, is saved in XML format and keeps track of the state of all nodes and resources. The CIB is synchronized across the cluster, and it handles requests to modify it.
To view the Cluster Information Base, use the cib option with the pcs command.
You can notice the configuration with node information, cluster name and much more.
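The raw CIB XML can be dumped to the terminal as below.

```shell
# Print the live Cluster Information Base (CIB) as XML to stdout
pcs cluster cib
```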
Logs to look for
The log file below is where to look for anything related to the cluster services.
# tail -f /var/log/cluster/corosync.log
[root@corcls1 ~]# tail -n 10 /var/log/cluster/corosync.log
Aug 03 14:01:49  corcls1.linuxsysadmins.local cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-10.raw
Aug 03 14:01:49  corcls1.linuxsysadmins.local cib: info: cib_file_write_with_digest: Wrote version 0.7.0 of the CIB to disk (digest: b1e78c0e1364bb94dec0fefdd2ff1bd1)
Aug 03 14:01:49  corcls1.linuxsysadmins.local cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.PeKvG9 (digest: /var/lib/pacemaker/cib/cib.GkgYGD)
Aug 03 14:01:54  corcls1.linuxsysadmins.local cib: info: cib_process_ping: Reporting our current digest to corcls2: 2e36d8d0181912ebe6a1f058cb613057 for 0.7.4 (0x55c951db95f0 0)
Aug 03 14:01:58  corcls1.linuxsysadmins.local crmd: info: crm_procfs_pid_of: Found cib active as process 1406
Aug 03 14:01:58  corcls1.linuxsysadmins.local crmd: notice: throttle_check_thresholds: High CPU load detected: 1.390000
Aug 03 14:01:58  corcls1.linuxsysadmins.local crmd: info: throttle_send_command: New throttle mode: 0100 (was ffffffff)
Aug 03 14:02:28  corcls1.linuxsysadmins.local crmd: info: throttle_check_thresholds: Moderate CPU load detected: 0.920000
Aug 03 14:02:28  corcls1.linuxsysadmins.local crmd: info: throttle_send_command: New throttle mode: 0010 (was 0100)
Aug 03 14:02:58  corcls1.linuxsysadmins.local crmd: info: throttle_send_command: New throttle mode: 0000 (was 0010)
That’s it; we have completed the basic Pacemaker cluster setup.
In our next guide, let’s see how to manage the cluster from the GUI.
The basic Pacemaker Linux cluster setup will provide high availability for any services configured to use it. We will see how to create a resource, configure fencing and much more in upcoming articles. Subscribe to our newsletter and stay with us to receive updates. Your feedback is most welcome in the comment section below.