This section provisions a fresh Kubernetes cluster using kubeadm. These steps should work on both Debian- and RedHat-based distros. This configuration assumes you already have an HA API endpoint configured, as discussed in the previous section.
- Enable NTP:
systemctl enable --now systemd-timesyncd
- Disable IPv6 if you're not using it. It just makes things easier to troubleshoot in my opinion.
Append the following lines to /etc/sysctl.d/k8s.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
- Bump inotify limits.
Append the following lines to /etc/sysctl.d/k8s.conf
fs.inotify.max_user_instances=1024
- Enable Secure Boot.
- Verify secure boot is enabled:
mokutil --sb-state
- Verify the kernel lockdown mode is set to integrity:
cat /sys/kernel/security/lockdown
- Set journald max size with
sed -i 's/#SystemMaxUse=/SystemMaxUse=1G/' /etc/systemd/journald.conf
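Restart journald so the new limit takes effect, and optionally check current usage:
systemctl restart systemd-journald
journalctl --disk-usage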
- Network redundancy (note that bond-mode 4 is LACP and requires switch config as well)
# apt install ifenslave
# cat /etc/network/interfaces
auto eno1
iface eno1 inet manual
    bond-master bond0
    bond-mode 4

auto eno2
iface eno2 inet manual
    bond-master bond0
    bond-mode 4

auto bond0
iface bond0 inet static
    bond-slaves eno1 eno2
    bond-mode 4
    address ADDRESS/MASK
    gateway GATEWAY
    dns-nameservers DNS_SERVERS
    dns-search DNS_SEARCH_DOMAINS
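Once the bond is up, its mode and the state of each slave can be checked with:
cat /proc/net/bonding/bond0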
- Disable swap
- Remove swap references from /etc/fstab
- Reboot, or deactivate the active swap with
swapoff -a
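To confirm no swap is active, swapon --show should print nothing and free -h should report 0B of swap:
swapon --show
free -h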
- Install iptables/nftables and enable it to start on boot
systemctl enable nftables --now
As the Dockershim is now deprecated, containerd is a good choice of container runtime.
- Add the Docker repo (provides the containerd packages).
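For example, on Debian the repo can be added roughly like this (RedHat-based distros use the yum/dnf repo at https://download.docker.com/linux/centos/docker-ce.repo instead; check the Docker docs for the current instructions):
apt-get update && apt-get install -y ca-certificates curl gnupg
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker.list
apt-get update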
- Install containerd.io and enable it to start on boot.
- Cgroups config:
- Kubernetes cgroup driver: As of 1.21, Kubernetes uses the systemd cgroup driver by default, but we'll specify it in provision.yaml as well.
- Systemd cgroup version: As of Debian 11, systemd defaults to using control groups v2.
- Containerd cgroup version: The default runtime_type is io.containerd.runc.v2 (the v2 runc shim), which supports cgroups v2.
- Containerd cgroup driver: Set containerd to use the SystemdCgroup driver. Generate a default config, then set the option:
containerd config default > /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
Starting with containerd 1.5, the cgroup driver and version can be verified as follows (a bug in versions < 1.5 produces the wrong output). The crictl command will be available after installing the Kubernetes packages.
# crictl -r unix:///run/containerd/containerd.sock info | grep runtimes -A 29
"runtimes": {
  "runc": {
    "runtimeType": "io.containerd.runc.v2",
    "runtimePath": "",
    "runtimeEngine": "",
    "PodAnnotations": [],
    "ContainerAnnotations": [],
    "runtimeRoot": "",
    "options": {
      "BinaryName": "",
      "CriuImagePath": "",
      "CriuPath": "",
      "CriuWorkPath": "",
      "IoGid": 0,
      "IoUid": 0,
      "NoNewKeyring": false,
      "NoPivotRoot": false,
      "Root": "",
      "ShimCgroup": "",
      "SystemdCgroup": true
    },
    "privileged_without_host_devices": false,
    "privileged_without_host_devices_all_devices_allowed": false,
    "baseRuntimeSpec": "",
    "cniConfDir": "",
    "cniMaxConfNum": 0,
    "snapshotter": "",
    "sandboxMode": "podsandbox"
  }
},
Add the following modules to a conf file in /etc/modules-load.d, e.g. /etc/modules-load.d/k8.conf
overlay
br_netfilter
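To load the modules immediately without rebooting:
modprobe overlay
modprobe br_netfilter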
Append the following lines to /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
Load the new parameters with sysctl --system
Add the Kubernetes repo and install the following packages.
- kubelet: The component that runs on every node in the cluster and does things like starting pods and containers.
- kubeadm: The command to bootstrap the cluster.
- kubectl: The command-line utility to talk to your cluster.
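On Debian, one way to do this is via the community-owned pkgs.k8s.io repo; the version in the path below is only an example, pin it to the release you're installing. apt-mark hold keeps unattended upgrades from bumping the packages mid-cluster.
apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' > /etc/apt/sources.list.d/kubernetes.list
apt-get update && apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl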
- Remove SSH keys.
rm /etc/ssh/ssh_host*
- Shut down the system and clone it. In vSphere environments, this VM could be converted into a template. The same image can be used for both the controllers and workers.
- On Debian distros, you'll need to regenerate SSH keys manually
dpkg-reconfigure openssh-server
- Change the IPs and hostnames of the clones
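For the hostname, the following works on both distro families; where the IP configuration lives (e.g. /etc/network/interfaces vs. nmcli) depends on the distro:
hostnamectl set-hostname <new-hostname>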
- Verify /sys/class/dmi/id/product_uuid is unique on every host
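Compare the output of the following across hosts:
cat /sys/class/dmi/id/product_uuid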
- Provision the cluster on a to-be master
kubeadm init --config provision.yaml --upload-certs
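For reference, a minimal provision.yaml might look like the following sketch. The endpoint and pod CIDR are placeholders, and the kubeadm config API version should match your kubeadm release (v1beta2 for the 1.20/1.21 era shown here, v1beta3 for newer releases):
# provision.yaml - minimal sketch, values are examples
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
controlPlaneEndpoint: "k8s-api.example.com:6443"  # the HA API endpoint from the previous section
networking:
  podSubnet: "192.168.0.0/16"                     # must match the CIDR in the Calico custom resources
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd                             # matches the containerd SystemdCgroup setting above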
- Copy the kubeconfig to the correct user account
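kubeadm init prints these exact steps; for a regular admin user they boil down to:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config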
- Install a network addon, paying attention to Network Policy support. Calico is a good option:
- Install the Operator:
kubectl create -f https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml
- Download the custom resources:
curl https://projectcalico.docs.tigera.io/manifests/custom-resources.yaml -O
- Customize if necessary
- Create the manifest:
kubectl create -f custom-resources.yaml
- Install calicoctl
- When it comes time to upgrade Calico, follow the upgrade instructions in the official Calico documentation.
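With the operator-based install, Calico's pods land in the calico-system namespace; a quick sanity check before moving on:
kubectl get pods -n tigera-operator
kubectl get pods -n calico-system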
- Approve the kubelet CSRs for the new nodes.
kubectl get csr
kubectl certificate approve <name>
kubectl get nodes should now show the new master node as Ready:
NAME      STATUS   ROLES                  AGE   VERSION
node-01   Ready    control-plane,master   1h    v1.20.5
- Join the other master nodes
- On the already-running master
- Reupload control plane certs and print the decryption key to retrieve them on the other master nodes.
kubeadm init phase upload-certs --upload-certs
- Print the join command to use on the other master nodes.
kubeadm token create --print-join-command
- Paste the join command, with --control-plane --certificate-key xxxx appended, on each to-be master (see the example below).
- Approve the CSRs for the new master nodes.
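The pasted command on each new master ends up looking something like this (token, hash, and certificate key are placeholders):
kubeadm join k8s-api.example.com:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <certificate-key>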
- Join the other worker nodes
kubeadm token create --print-join-command
- Approve the CSRs for the new worker nodes.
- Verify
kubectl get nodes should now show all nodes as Ready:
NAME      STATUS   ROLES                  AGE   VERSION
node-01   Ready    control-plane,master   1h    v1.20.5
node-02   Ready    control-plane,master   1h    v1.20.5
node-03   Ready    control-plane,master   1h    v1.20.5
node-04   Ready    <none>                 1h    v1.20.5
node-05   Ready    <none>                 1h    v1.20.5
node-06   Ready    <none>                 1h    v1.20.5
kubectl get pods --all-namespaces should show all pods as Running
Run the Sonobuoy conformance test
- NOTE: If this exits within a couple of minutes, it most likely timed out connecting to the API or failed a name lookup in CoreDNS.
- Start the tests. They take a while:
sonobuoy run --wait
- Watch the logs in another window:
kubectl logs sonobuoy --namespace sonobuoy -f
- Get the results:
results=$(sonobuoy retrieve)
- View the results:
sonobuoy results $results
- Delete the tests:
sonobuoy delete --wait
Run the kube-bench security conformance tests
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
- Wait until kubectl get pods | grep kube-bench shows Completed
kubectl logs kube-bench-xxxxx | less
kubectl delete job kube-bench