Guide: Kubernetes Multi-Master HA Cluster with kubeadm

Development Sep 22, 2020

Hello everybody, Tansanrao here! This post will guide you through setting up a Multi-Master HA (High-Availability) Kubernetes Cluster on bare-metal or virtual machines.

All our VM images will be based on Ubuntu 20.04.1 Server and for the purpose of this guide, will be Virtual Machines on a VMware ESXi host.

We will need 7 Virtual Machines, each with at least 2 cores and 4 GB of RAM for decent performance. Also make sure that every machine has a static IP address reserved on your DHCP server.

We are using the following Hostnames & IP Assignments:

  • 1 HAProxy Load Balancer Node
    — k8s-haproxy : 192.168.1.112
  • 3 Etcd/Kubernetes Master Nodes
    — k8s-master-a : 192.168.1.113
    — k8s-master-b : 192.168.1.114
    — k8s-master-c : 192.168.1.115
  • 3 Kubernetes Worker Nodes
    — k8s-node-a : 192.168.1.116
    — k8s-node-b : 192.168.1.117
    — k8s-node-c : 192.168.1.118

We will also need 1 Linux client machine. If you do not have one, the client tools can be installed on the HAProxy node.

The minimum for production use is 2 physical hosts with at least 1 Master on each. The recommended setup is 3 hosts, each running 1 Master and 1 Worker Node, with an external load balancer. For the sake of this guide, I am running all 7 nodes on the same ESXi host. A single host is fine for lab and test environments, but do not run anything mission-critical on it.
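The reason for running three masters comes down to etcd quorum: a cluster of N members needs floor(N/2) + 1 of them healthy to accept writes. The snippet below is only a sketch of that arithmetic. The function names quorum and tolerated_failures are made up for this example and are not part of etcd or kubeadm.

```shell
# Illustrative helpers (hypothetical names), not part of any Kubernetes tooling.

# Number of members that must be healthy for the cluster to accept writes.
quorum() {
  local members="$1"
  echo $(( members / 2 + 1 ))
}

# Number of members that can fail while quorum is still held.
tolerated_failures() {
  local members="$1"
  echo $(( members - $(quorum "$members") ))
}

for n in 1 2 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 3 members the cluster survives the loss of 1. Keep in mind that if 3 members are spread over only 2 physical hosts, one host carries 2 of them, so losing that host also loses quorum.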

Let’s get started!

Prepare Virtual Machines / Servers

Start by preparing 7 machines with Ubuntu 20.04.1 Server using the correct hostnames and IP addresses. Once done, power on all of them and apply the latest updates using:

sudo apt update && sudo apt upgrade

Setting up Client Tools

Installing cfssl.

CFSSL is an SSL tool by Cloudflare which lets us create our Certs and CAs.

Step 1 - Download the binaries

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64

Step 2 - Add the execution permission to the binaries

chmod +x cfssl*

Step 3 - Move the binaries to /usr/local/bin

sudo mv cfssl_linux-amd64 /usr/local/bin/cfssl
sudo mv cfssljson_linux-amd64 /usr/local/bin/cfssljson

Step 4 - Verify the installation

cfssl version

Installing kubectl

Step 1 - Get the binary

Make sure it is the same version as the cluster. In our case, we are using v1.19.

curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.19.0/bin/linux/amd64/kubectl
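Kubernetes supports kubectl within one minor version of the API server, so an exact match is the safest choice but a small difference is tolerated. Below is a rough sketch of that check. minor_of and skew_ok are hypothetical helper names, and the comparison assumes both versions share the same major version.

```shell
# Hypothetical helpers for comparing a kubectl version with a cluster version.

# Extract the minor number from a version string such as v1.19.0.
minor_of() {
  local v="${1#v}"   # strip the leading "v"
  v="${v#*.}"        # drop the major number
  echo "${v%%.*}"    # keep only the minor number
}

# Succeeds when the two versions are at most one minor version apart.
# Assumption: both versions have the same major number.
skew_ok() {
  local a b d
  a=$(minor_of "$1")
  b=$(minor_of "$2")
  d=$(( a - b ))
  [ "${d#-}" -le 1 ]
}

if skew_ok v1.19.0 v1.19.2; then
  echo "kubectl version is within the supported skew"
fi
```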

Step 2 - Add the execution permission to the binary

chmod +x kubectl

Step 3 - Move the binary to /usr/local/bin

sudo mv kubectl /usr/local/bin

Step 4 - Verify the installation

kubectl version --client

Installing HAProxy Load Balancer

As we will be deploying three Kubernetes master nodes, we need to deploy an HAProxy Load Balancer in front of them to distribute the traffic.

Step 1 - SSH to the HAProxy VM

ssh tansanrao@192.168.1.112

Step 2 - Install HAProxy

sudo apt-get install haproxy

Step 3 - Configure HAProxy

sudo nano /etc/haproxy/haproxy.cfg

Leave the existing global and defaults sections as they are and add the frontend and backend sections below them:

global
...
defaults
...
frontend kubernetes
bind 192.168.1.112:6443
option tcplog
mode tcp
default_backend kubernetes-master-nodes


backend kubernetes-master-nodes
mode tcp
balance roundrobin
option tcp-check
server k8s-master-a 192.168.1.113:6443 check fall 3 rise 2
server k8s-master-b 192.168.1.114:6443 check fall 3 rise 2
server k8s-master-c 192.168.1.115:6443 check fall 3 rise 2
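If the list of masters changes later, the backend section can be generated rather than edited by hand. This is a minimal sketch, assuming the same options used in the config above. render_backend is a hypothetical helper, not an HAProxy tool.

```shell
# Hypothetical helper: print an HAProxy backend section for a list of
# name=ip pairs, using the same options as the config in this guide.
render_backend() {
  local name="$1" port="$2" pair
  shift 2
  printf 'backend %s\n' "$name"
  printf 'mode tcp\nbalance roundrobin\noption tcp-check\n'
  for pair in "$@"; do
    # ${pair%%=*} is the server name, ${pair#*=} is its IP address.
    printf 'server %s %s:%s check fall 3 rise 2\n' "${pair%%=*}" "${pair#*=}" "$port"
  done
}

render_backend kubernetes-master-nodes 6443 \
  k8s-master-a=192.168.1.113 \
  k8s-master-b=192.168.1.114 \
  k8s-master-c=192.168.1.115
```

Once the file is saved, sudo haproxy -c -f /etc/haproxy/haproxy.cfg checks the configuration for errors without starting the proxy.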

Step 4 - Restart HAProxy

sudo systemctl restart haproxy

Generating the TLS certificates

These steps can be done on your Linux client if you have one or on the HAProxy machine depending on where you installed the cfssl tool.

Creating a Certificate Authority

Step 1 - Create the certificate authority configuration file

nano ca-config.json

Enter the following config:

{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": ["signing", "key encipherment", "server auth", "client auth"],
        "expiry": "8760h"
      }
    }
  }
}
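The expiry values are given in hours, so 8760h works out to one year, and certificates signed with the kubernetes profile will need to be reissued before then. As a quick sanity check of the arithmetic (expiry_days is just an illustrative helper name):

```shell
# Hypothetical helper: convert a cfssl-style expiry such as "8760h" to days.
expiry_days() {
  local hours="${1%h}"   # strip the trailing "h"
  echo $(( hours / 24 ))
}

echo "8760h is $(expiry_days 8760h) days"
```

Once a certificate has been signed with this profile, openssl x509 -in kubernetes.pem -noout -enddate prints its actual expiry date.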

Step 2 - Create the certificate authority signing request configuration file

nano ca-csr.json

Enter the following config, changing the names as necessary:

{
  "CN": "Kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "IN",
      "L": "Belgaum",
      "O": "Tansanrao",
      "OU": "CA",
      "ST": "Karnataka"
    }
  ]
}

Step 3 - Generate the certificate authority certificate and private key

cfssl gencert -initca ca-csr.json | cfssljson -bare ca

Step 4 - Verify that the ca-key.pem and the ca.pem were generated

ls -la

Creating the certificate for the Etcd cluster

Step 1 - Create the certificate signing request configuration file

nano kubernetes-csr.json

Add the following config:

{
  "CN": "Kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "IN",
      "L": "Belgaum",
      "O": "Tansanrao",
      "OU": "CA",
      "ST": "Karnataka"
    }
  ]
}

Step 2 - Generate the certificate and private key

cfssl gencert \
-ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-hostname=192.168.1.112,192.168.1.113,192.168.1.114,192.168.1.115,127.0.0.1,kubernetes.default \
-profile=kubernetes kubernetes-csr.json | \
cfssljson -bare kubernetes

Step 3 - Verify that the kubernetes-key.pem and the kubernetes.pem file were generated.

ls -la

Step 4 - Copy the certificates to each node

scp ca.pem kubernetes.pem kubernetes-key.pem tansanrao@192.168.1.113:~
scp ca.pem kubernetes.pem kubernetes-key.pem tansanrao@192.168.1.114:~
scp ca.pem kubernetes.pem kubernetes-key.pem tansanrao@192.168.1.115:~
scp ca.pem kubernetes.pem kubernetes-key.pem tansanrao@192.168.1.116:~
scp ca.pem kubernetes.pem kubernetes-key.pem tansanrao@192.168.1.117:~
scp ca.pem kubernetes.pem kubernetes-key.pem tansanrao@192.168.1.118:~
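The six copy commands differ only in the target address, so they can also be written as a loop. The following is a sketch rather than a drop-in replacement: copy_certs and the DRY_RUN switch are names invented for this example, and the fallback user name ubuntu is a placeholder. By default it only prints the commands it would run.

```shell
# Hypothetical helper: copy the CA and etcd certificates to each node.
# Dry-run by default: prints the scp commands instead of running them.
# Set DRY_RUN=0 to actually copy the files.
copy_certs() {
  local user="$1" ip
  shift
  for ip in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp ca.pem kubernetes.pem kubernetes-key.pem $user@$ip:~"
    else
      scp ca.pem kubernetes.pem kubernetes-key.pem "$user@$ip:~"
    fi
  done
}

# Masters (.113 to .115) and workers (.116 to .118) from the table at the top.
# "ubuntu" is only a placeholder used when $USER is not set.
copy_certs "${USER:-ubuntu}" \
  192.168.1.113 192.168.1.114 192.168.1.115 \
  192.168.1.116 192.168.1.117 192.168.1.118
```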

Preparing the nodes for kubeadm

Initial Setup for all master and node machines

Copy the commands below into a file named setup.sh and then execute it with . setup.sh.

This script checks for and uninstalls older versions of Docker and replaces them with the latest version of docker-ce for Ubuntu 20.04. It also adds the Kubernetes repository, installs kubelet, kubeadm and kubectl, and marks those packages as held to prevent automatic updates.

sudo apt-get remove docker docker-engine docker.io containerd runc

sudo apt-get install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

sudo apt-get install -y docker-ce docker-ce-cli containerd.io

sudo usermod -aG docker "$USER"

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

sudo swapoff -a

The swapoff command at the end of the script only disables swap until the next reboot. To keep swap off permanently, edit /etc/fstab on each machine.

sudo nano /etc/fstab

Comment out the line that starts with /swap or /swap.img. My /etc/fstab looks like this after making the change:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/ubuntu-vg/ubuntu-lv during curtin installation
/dev/disk/by-id/dm-uuid-LVM-s96R5iaP77QRtKuZZ0mYLuJcarDuQldMUj3yYFLQDRKWOqz9PHtLTnMMl2cbxpkC / ext4 defaults 0 0
# /boot was on /dev/sda2 during curtin installation
/dev/disk/by-uuid/bcc851c2-bbc4-44c0-bb36-c142eedd63a6 /boot ext4 defaults 0 0
#/swap.img      none    swap    sw      0       0
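On several machines the same edit can be done non-interactively. Here is a minimal sketch, assuming swap entries look like the one above (a whitespace-separated "swap" in the type column). comment_swap is a hypothetical helper that reads fstab content on standard input, and the sample lines are shortened.

```shell
# Hypothetical helper: prefix every active swap entry with "#".
# Lines that are already commented out are left untouched.
comment_swap() {
  sed -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/'
}

# Demonstration on shortened sample lines instead of the real /etc/fstab.
printf '%s\n' \
  '/dev/disk/by-uuid/bcc851c2 /boot ext4 defaults 0 0' \
  '/swap.img none swap sw 0 0' | comment_swap
```

To apply the same expression in place, it can be passed to sudo sed -i.bak -E with /etc/fstab as the file argument, which keeps a backup copy as /etc/fstab.bak.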

Installing and configuring Etcd on all 3 Master Nodes

Step 1 - Download and move etcd files and certs to their respective places

sudo mkdir /etc/etcd /var/lib/etcd

sudo mv ~/ca.pem ~/kubernetes.pem ~/kubernetes-key.pem /etc/etcd

wget https://github.com/etcd-io/etcd/releases/download/v3.4.13/etcd-v3.4.13-linux-amd64.tar.gz

tar xvzf etcd-v3.4.13-linux-amd64.tar.gz

sudo mv etcd-v3.4.13-linux-amd64/etcd* /usr/local/bin/

Step 2 - Create an etcd systemd unit file

sudo nano /etc/systemd/system/etcd.service

Enter the following config:

[Unit]
Description=etcd
Documentation=https://github.com/coreos


[Service]
ExecStart=/usr/local/bin/etcd \
  --name 192.168.1.113 \
  --cert-file=/etc/etcd/kubernetes.pem \
  --key-file=/etc/etcd/kubernetes-key.pem \
  --peer-cert-file=/etc/etcd/kubernetes.pem \
  --peer-key-file=/etc/etcd/kubernetes-key.pem \
  --trusted-ca-file=/etc/etcd/ca.pem \
  --peer-trusted-ca-file=/etc/etcd/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth \
  --initial-advertise-peer-urls https://192.168.1.113:2380 \
  --listen-peer-urls https://192.168.1.113:2380 \
  --listen-client-urls https://192.168.1.113:2379,http://127.0.0.1:2379 \
  --advertise-client-urls https://192.168.1.113:2379 \
  --initial-cluster-token etcd-cluster-0 \
  --initial-cluster 192.168.1.113=https://192.168.1.113:2380,192.168.1.114=https://192.168.1.114:2380,192.168.1.115=https://192.168.1.115:2380 \
  --initial-cluster-state new \
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5



[Install]
WantedBy=multi-user.target

Replace the IP address in every field except --initial-cluster so that it matches the IP of the machine you are configuring.
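To make it clearer which values change per machine and which stay the same, here is a small sketch that prints both. initial_cluster and node_flags are hypothetical helpers written for this post, not etcd commands. The flags themselves are the ones used in the unit file above.

```shell
# Hypothetical helper: build the --initial-cluster value, which is
# identical on all three masters.
initial_cluster() {
  local out="" ip
  for ip in "$@"; do
    out="${out:+$out,}$ip=https://$ip:2380"
  done
  echo "$out"
}

# Hypothetical helper: print the flags that must carry the local machine's IP.
node_flags() {
  local ip="$1"
  printf '%s\n' \
    "--name $ip" \
    "--initial-advertise-peer-urls https://$ip:2380" \
    "--listen-peer-urls https://$ip:2380" \
    "--listen-client-urls https://$ip:2379,http://127.0.0.1:2379" \
    "--advertise-client-urls https://$ip:2379"
}

echo "Same on every master:"
initial_cluster 192.168.1.113 192.168.1.114 192.168.1.115
echo "Specific to k8s-master-b:"
node_flags 192.168.1.114
```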

Step 3 - Reload the daemon configuration.

sudo systemctl daemon-reload

Step 4 - Enable etcd to start at boot time.

sudo systemctl enable etcd

Step 5 - Start etcd.

sudo systemctl start etcd

Repeat the process for all 3 master nodes and then move to step 6.

Step 6 - Verify that the cluster is up and running.

ETCDCTL_API=3 etcdctl member list

It should give you an output similar to this:

73ea126859b3ba4, started, 192.168.1.114, https://192.168.1.114:2380, https://192.168.1.114:2379, false
a28911111213cc6c, started, 192.168.1.115, https://192.168.1.115:2380, https://192.168.1.115:2379, false
feadb5a763a32caa, started, 192.168.1.113, https://192.168.1.113:2380, https://192.168.1.113:2379, false
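A quick way to confirm that all three members report started is to count them. This is a sketch that assumes the comma-separated output format shown above. count_started is a hypothetical helper, and the demonstration feeds it the sample output from this post rather than a live cluster.

```shell
# Hypothetical helper: count members whose status column is "started".
# Reads "etcdctl member list" output on standard input.
count_started() {
  grep -c ', started, '
}

count_started <<'EOF'
73ea126859b3ba4, started, 192.168.1.114, https://192.168.1.114:2380, https://192.168.1.114:2379, false
a28911111213cc6c, started, 192.168.1.115, https://192.168.1.115:2380, https://192.168.1.115:2379, false
feadb5a763a32caa, started, 192.168.1.113, https://192.168.1.113:2380, https://192.168.1.113:2379, false
EOF
```

On a master node the same check would be ETCDCTL_API=3 etcdctl member list | count_started, and the result should be 3.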

Initialising the Master Nodes

Initialising the first Master Node

Step 1 - SSH to the first Master Node

ssh tansanrao@192.168.1.113

Step 2 - Create the configuration file for kubeadm

nano config.yaml

Enter the following config:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
controlPlaneEndpoint: "192.168.1.112:6443"
etcd:
  external:
    endpoints:
      - https://192.168.1.113:2379
      - https://192.168.1.114:2379
      - https://192.168.1.115:2379
    caFile: /etc/etcd/ca.pem
    certFile: /etc/etcd/kubernetes.pem
    keyFile: /etc/etcd/kubernetes-key.pem
networking:
  podSubnet: 10.30.0.0/24
apiServer:
  certSANs:
    - "192.168.1.112"
  extraArgs:
    apiserver-count: "3"

Add any additional domains or IP Addresses that you would want to connect to the cluster under certSANs.

Step 3 - Initialise the machine as a master node

sudo kubeadm init --config=config.yaml

Step 4 - Copy the certificates to the two other masters

sudo scp -r /etc/kubernetes/pki tansanrao@192.168.1.114:~
sudo scp -r /etc/kubernetes/pki tansanrao@192.168.1.115:~

Initialising the second Master Node

Step 1 - SSH to the second Master Node

ssh tansanrao@192.168.1.114

Step 2 - Remove the apiserver.crt and apiserver.key

rm ~/pki/apiserver.*

Step 3 - Move the certificates to the /etc/kubernetes directory.

sudo mv ~/pki /etc/kubernetes/

Step 4 - Create the configuration file for kubeadm

nano config.yaml

Enter the following config:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
controlPlaneEndpoint: "192.168.1.112:6443"
etcd:
  external:
    endpoints:
      - https://192.168.1.113:2379
      - https://192.168.1.114:2379
      - https://192.168.1.115:2379
    caFile: /etc/etcd/ca.pem
    certFile: /etc/etcd/kubernetes.pem
    keyFile: /etc/etcd/kubernetes-key.pem
networking:
  podSubnet: 10.30.0.0/24
apiServer:
  certSANs:
    - "192.168.1.112"
  extraArgs:
    apiserver-count: "3"

Step 5 - Initialise the machine as a master node.

sudo kubeadm init --config=config.yaml

Initialising the third master node

Step 1 - SSH to the third Master Node

ssh tansanrao@192.168.1.115

Step 2 - Remove the apiserver.crt and apiserver.key

rm ~/pki/apiserver.*

Step 3 - Move the certificates to the /etc/kubernetes directory.

sudo mv ~/pki /etc/kubernetes/

Step 4 - Create the configuration file for kubeadm

nano config.yaml

Enter the following config:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
controlPlaneEndpoint: "192.168.1.112:6443"
etcd:
  external:
    endpoints:
      - https://192.168.1.113:2379
      - https://192.168.1.114:2379
      - https://192.168.1.115:2379
    caFile: /etc/etcd/ca.pem
    certFile: /etc/etcd/kubernetes.pem
    keyFile: /etc/etcd/kubernetes-key.pem
networking:
  podSubnet: 10.30.0.0/24
apiServer:
  certSANs:
    - "192.168.1.112"
  extraArgs:
    apiserver-count: "3"

Step 5 - Initialise the machine as a master node.

sudo kubeadm init --config=config.yaml

Step 6 - Save the join command printed in the output after the above command

Example Output:

kubeadm join 192.168.1.112:6443 --token c5tkdt.47tjw72synw7qbn9 \
    --discovery-token-ca-cert-hash sha256:069081b1116e821958da62e8d1c185b1df94849bdeb414761e992585f4034ce8 

NOTE: Use the output from your own terminal, not the example from this post.

Configure kubectl on the client machine

Step 1 - SSH to one of the master nodes

ssh tansanrao@192.168.1.113

Step 2 - Temporarily add read permission to the admin.conf file

sudo chmod +r /etc/kubernetes/admin.conf

Step 3 - From the client machine, copy the configuration file.

scp tansanrao@192.168.1.113:/etc/kubernetes/admin.conf .

Step 4 - Create and configure the kubectl configuration directory.

mkdir -p ~/.kube
mv admin.conf ~/.kube/config
chmod 600 ~/.kube/config

Step 5 - Go back to the SSH session and revert the permissions of the config file

sudo chmod 600 /etc/kubernetes/admin.conf

Step 6 - Test to see if you can access the Kubernetes API from the client machine

kubectl get nodes

Expected Output:

NAME           STATUS     ROLES    AGE     VERSION
k8s-master-a   NotReady   master   44m     v1.19.2
k8s-master-b   NotReady   master   11m     v1.19.2
k8s-master-c   NotReady   master   5m50s   v1.19.2

Initialise the worker nodes

SSH into each worker node and execute the kubeadm join command that you copied previously.

sudo kubeadm join 192.168.1.112:6443 --token c5tkdt.47tjw72synw7qbn9 \
    --discovery-token-ca-cert-hash sha256:069081b1116e821958da62e8d1c185b1df94849bdeb414761e992585f4034ce8 

Once all three worker nodes have joined the cluster, test the API to check the available nodes from the client machine.

kubectl get nodes

Expected Output:

NAME           STATUS     ROLES    AGE   VERSION
k8s-master-a   NotReady   master   53m   v1.19.2
k8s-master-b   NotReady   master   20m   v1.19.2
k8s-master-c   NotReady   master   14m   v1.19.2
k8s-node-a     NotReady   <none>   26s   v1.19.2
k8s-node-b     NotReady   <none>   19s   v1.19.2
k8s-node-c     NotReady   <none>   18s   v1.19.2
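All nodes report NotReady at this point because no pod network has been deployed yet, which is expected. To keep track of which nodes are still waiting, the output can be filtered. This is a small sketch that assumes the column layout shown above. not_ready_nodes is an illustrative name, and the demonstration uses a shortened copy of the sample output with one node changed to Ready.

```shell
# Hypothetical helper: print the names of nodes whose STATUS is not "Ready".
# Reads "kubectl get nodes" output on standard input and skips the header.
# Note: a combined status such as "Ready,SchedulingDisabled" is also listed.
not_ready_nodes() {
  awk '$1 != "NAME" && $2 != "Ready" { print $1 }'
}

not_ready_nodes <<'EOF'
NAME           STATUS     ROLES    AGE   VERSION
k8s-master-a   Ready      master   53m   v1.19.2
k8s-master-b   NotReady   master   20m   v1.19.2
k8s-node-a     NotReady   <none>   26s   v1.19.2
EOF
```

Against the real cluster this would be kubectl get nodes | not_ready_nodes, and an empty result means every node is Ready.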

Deploying the overlay network

We will be using Project Calico as the overlay network, but you are free to use an alternative such as Flannel or Weave Net.

Apply the manifest to deploy the Calico overlay

curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml

Check that all the pods deployed correctly

kubectl get pods -n kube-system

Congratulations! Your Bare-metal HA Cluster is ready for use. I recommend setting up Rancher Server to manage it, along with Traefik as the Ingress Controller, Longhorn as the Persistent Volume Provider, Prometheus & Grafana for Metrics and the EFK Stack for Logging and Distributed Tracing. Guides for these are in the works and will be posted in the coming weeks.

For any doubts, suggestions or issues, leave a comment below and I will get back to you asap! Follow me on Twitter & Instagram for behind the scenes and updates.

Tanuj Ravi Rao

99% of the time my brain is thinking blah, meh, why, huh, WTF, food and computers. The other 1% i’m usually asleep.
