K3S-Deploy - Suggestions for enhancement #68

Open
nmehran opened this issue Mar 8, 2024 · 2 comments

nmehran commented Mar 8, 2024

Below are some simple enhancements that would improve the robustness of the k3s.sh script.

End script immediately on error

Note: This gives a cleaner log, because the script stops at the command that failed instead of continuing past it.

Insert at the top of the script:

set -e  # Exit immediately if a command exits with a non-zero status.
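As a rough illustration of what this changes, the sketch below runs a tiny child script with `set -e` and records where it stops. The variable names `demo_output` and `demo_status` are made up for this example and are not part of k3s.sh.

```shell
# Run a small child script with `set -e` and capture its output and exit status.
demo_output=$(sh -c '
    set -e
    echo "step 1"
    false            # exits non-zero, so the child script stops here
    echo "step 2"    # never reached
') && demo_status=0 || demo_status=$?

echo "output: $demo_output"
echo "exit status: $demo_status"
```

One caveat worth keeping in mind: `set -e` does not trigger for a command used as an `if` condition or on the left side of `&&` or `||`, so retry logic built on `if some_command; then` keeps working with it enabled.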

Update and upgrade system packages, with lock mitigation support

Add:

# Update and upgrade system packages, with lock mitigation support
attempt_limit=10
attempt_delay_seconds=3
for ((attempt=1; attempt<=attempt_limit; attempt++)); do
    if sudo apt-get update && sudo apt-get upgrade -y; then
        echo "Package list updated and packages upgraded successfully."
        break # Success
    elif ((attempt == attempt_limit)); then
        echo "Failed to update and upgrade packages within $attempt_limit attempts."
        exit 1 # Failure after all attempts
    else
        echo "Attempt $attempt of $attempt_limit failed. Retrying in $attempt_delay_seconds seconds..."
        sleep $attempt_delay_seconds
    fi
done
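If the same retry pattern is needed for other commands, the loop above could be factored into a helper. This is only a sketch: the function name `retry` and its `retry_*` variables are hypothetical, and it is written in plain POSIX shell so it does not depend on Bash arithmetic loops.

```shell
# retry LIMIT DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or LIMIT attempts have been made.
# (Hypothetical helper, not part of the original k3s.sh.)
retry() {
    retry_limit=$1
    retry_delay=$2
    shift 2
    retry_attempt=1
    while [ "$retry_attempt" -le "$retry_limit" ]; do
        if "$@"; then
            return 0    # Success
        fi
        if [ "$retry_attempt" -lt "$retry_limit" ]; then
            echo "Attempt $retry_attempt of $retry_limit failed. Retrying in $retry_delay seconds..." >&2
            sleep "$retry_delay"
        fi
        retry_attempt=$((retry_attempt + 1))
    done
    echo "Command failed after $retry_limit attempts: $*" >&2
    return 1            # Failure after all attempts
}

# Possible usage in the script (commented out because it needs root and apt):
#   retry 10 3 sudo apt-get update
#   retry 10 3 sudo apt-get upgrade -y
```

Returning a status instead of calling `exit` leaves the decision to the caller, which under `set -e` still ends the script when the final attempt fails.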

Resynchronize NTP on each node to keep the node clocks in sync

Note: k3sup and other downloads may fail if a node's clock is out of sync, as can happen between VM snapshots, so this step is important.

Insert:

# Install policycoreutils for each node
for newnode in "${all[@]}"; do
  ssh "$user@$newnode" -i ~/.ssh/"$certName" sudo su <<EOF
  sudo timedatectl set-ntp off  # ***** This has been inserted *****
  sudo timedatectl set-ntp on  # ***** This has been inserted *****
  NEEDRESTART_MODE=a apt install policycoreutils -y
  exit
EOF
  echo -e " \033[32;5mPolicyCoreUtils installed!\033[0m"
done
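Toggling NTP restarts synchronization but does not by itself confirm that the clock has caught up. A possible follow-up, sketched below, polls `timedatectl show` until it reports `NTPSynchronized=yes`. The function names `clock_synced` and `wait_for_sync` are hypothetical, and the sketch assumes a systemd version whose `timedatectl` supports the `show` subcommand.

```shell
# Returns 0 when systemd reports the clock as NTP-synchronized.
# (Assumes `timedatectl show` is available on the node.)
clock_synced() {
    [ "$(timedatectl show --property=NTPSynchronized --value 2>/dev/null)" = "yes" ]
}

# wait_for_sync TIMEOUT_SECONDS
# Polls once per second until the clock is synchronized or the timeout expires.
wait_for_sync() {
    sync_remaining=$1
    while ! clock_synced; do
        if [ "$sync_remaining" -le 0 ]; then
            echo "Clock not synchronized after $1 seconds." >&2
            return 1
        fi
        sleep 1
        sync_remaining=$((sync_remaining - 1))
    done
    echo "Clock synchronized."
}
```

On a node, something like `wait_for_sync 30` could then run right after the `set-ntp on` line and before the package install.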

Add a robust wait to the "Install Metallb" step

Append:

# Step 8: Install Metallb
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.12.1/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml
kubectl wait --for=condition=ready pod -l app=metallb --namespace=metallb-system --timeout=300s   # ***** This has been appended *****
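One thing to watch for: `kubectl wait` can return an error right away if no pods match the selector yet, which may happen immediately after `kubectl apply`. A possible guard, sketched below, first polls until at least one matching pod exists and only then waits for readiness. The helper name `wait_for_pods` and its `wfp_*` variables are hypothetical.

```shell
# wait_for_pods NAMESPACE SELECTOR ATTEMPTS DELAY_SECONDS
# (Hypothetical helper, not part of the original k3s.sh.)
wait_for_pods() {
    wfp_namespace=$1
    wfp_selector=$2
    wfp_attempts=$3
    wfp_delay=$4
    wfp_try=1
    # Phase 1: poll until the selector matches at least one pod.
    until [ -n "$(kubectl get pods --namespace "$wfp_namespace" -l "$wfp_selector" -o name 2>/dev/null)" ]; do
        if [ "$wfp_try" -ge "$wfp_attempts" ]; then
            echo "No pods matching '$wfp_selector' appeared in $wfp_namespace." >&2
            return 1
        fi
        wfp_try=$((wfp_try + 1))
        sleep "$wfp_delay"
    done
    # Phase 2: block until every matching pod reports Ready.
    kubectl wait --for=condition=ready pod -l "$wfp_selector" \
        --namespace "$wfp_namespace" --timeout=300s
}

# Possible usage after the two `kubectl apply` lines:
#   wait_for_pods metallb-system app=metallb 30 2
```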

JamesTurland (Owner) commented

Thank you, I will test all of these when I can.

nmehran (Author) commented Apr 9, 2024

After some further testing, your original time synchronization method seems to be more robust!

sudo timedatectl set-ntp off
sudo timedatectl set-ntp on

I think synchronizing the time on each node inside the `for newnode in "${all[@]}"; do` loop is the most important improvement we could make: in my tests, the nodes failed to install the k3sup dependencies without it.

I've gone ahead and edited the above post to reflect these changes.
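To check whether a node's clock has actually drifted before or after the resync, one option is to compare Unix timestamps between the admin machine and the node. The sketch below is illustrative only: `clock_skew` and `skew_ok` are hypothetical helper names, and the 5-second tolerance in the usage comment is an arbitrary assumption.

```shell
# clock_skew LOCAL_EPOCH REMOTE_EPOCH
# Prints the absolute difference between two Unix timestamps, in seconds.
clock_skew() {
    skew=$(( $1 - $2 ))
    if [ "$skew" -lt 0 ]; then
        skew=$(( 0 - skew ))
    fi
    echo "$skew"
}

# skew_ok LOCAL_EPOCH REMOTE_EPOCH MAX_SECONDS
# Succeeds when the two clocks differ by at most MAX_SECONDS.
skew_ok() {
    [ "$(clock_skew "$1" "$2")" -le "$3" ]
}

clock_skew 1712650000 1712650042   # prints 42

# Possible usage inside the per-node loop (commented out because it needs ssh):
#   remote_now=$(ssh -i ~/.ssh/"$certName" "$user@$newnode" date +%s)
#   skew_ok "$(date +%s)" "$remote_now" 5 || echo "Clock skew detected on $newnode"
```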
