EKS Auto mode cannot be disabled #5105

Open
flostadler opened this issue Jan 14, 2025 · 3 comments
Labels
awaiting-upstream · The issue cannot be resolved without action in another repository (may be owned by Pulumi).
blocked · The issue cannot be resolved without 3rd party action.
kind/bug · Some behavior is incorrect or out of spec.

Comments

@flostadler
Contributor

Describe what happened

Upstream has an issue that prevents disabling EKS Auto Mode without replacing the cluster: hashicorp/terraform-provider-aws#40582

Disabling it fails with:

Diagnostics:
  pulumi:pulumi:Stack (brainfish-universe-eks-au):
    error: eks:index:Cluster resource 'brainfish-au' has a problem: grpc: the client connection is closing

  aws:eks:Cluster (brainfish-au-eksCluster):
    error:   sdk-v2/provider2.go:515: sdk.helper_schema: compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false: [email protected]
    error: diffing urn:pulumi:au::brainfish-universe-eks::eks:index:Cluster$aws:eks/cluster:Cluster::brainfish-au-eksCluster: 1 error occurred:
        * compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false

To work around this, I'd recommend disabling Auto Mode manually (via the AWS CLI or Console; see the command below) and then running pulumi refresh.
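
A sketch of the manual disable via the AWS CLI, assuming a CLI version recent enough to support the Auto Mode fields on update-cluster-config; the cluster name is a placeholder:

aws eks update-cluster-config \
  --name my-cluster \
  --compute-config enabled=false \
  --kubernetes-network-config '{"elasticLoadBalancing":{"enabled":false}}' \
  --storage-config '{"blockStorage":{"enabled":false}}'

All three fields have to be passed together, mirroring the constraint in the error above. Once the update finishes, pulumi refresh reconciles the stack state.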

Sample program

Run pulumi up with the following program and then remove the autoMode block before running pulumi up again.

import * as awsx from "@pulumi/awsx";
import * as eks from "@pulumi/eks";

const eksVpc = new awsx.ec2.Vpc("eks-vpc", {
    enableDnsHostnames: true,
});

// Create the EKS cluster
const eksCluster = new eks.Cluster("eks-cluster", {
    vpcId: eksVpc.vpcId,
    authenticationMode: eks.AuthenticationMode.Api,
    publicSubnetIds: eksVpc.publicSubnetIds,
    privateSubnetIds: eksVpc.privateSubnetIds,
    skipDefaultNodeGroup: true,
    skipDefaultSecurityGroups: true,
    // set autoMode.enabled to `false` or remove the autoMode block on the next up
    autoMode: {
        enabled: true
    }
});

Log output

n/a

Affected Resource(s)

  • aws.eks.Cluster

Output of pulumi about

n/a

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

@flostadler
Contributor Author

The scope of this is even broader: preview is broken if the computeConfig block is unknown/computed. Repro:

import * as pulumi from "@pulumi/pulumi";
import * as awsx from "@pulumi/awsx";
import * as eks from "@pulumi/eks";
import * as aws from "@pulumi/aws";

// Grab some values from the Pulumi configuration (or use default values)
const config = new pulumi.Config();
const vpcNetworkCidr = config.get("vpcNetworkCidr") || "10.0.0.0/16";

// Create a new VPC
const eksVpc = new awsx.ec2.Vpc("eks-vpc", {
    enableDnsHostnames: true,
    cidrBlock: vpcNetworkCidr,
    numberOfAvailabilityZones: 2,
    subnetStrategy: "Auto",
});

const nodeRole = new aws.iam.Role("eks-node-role", {
    assumeRolePolicy: aws.iam.getPolicyDocumentOutput({
        version: "2012-10-17",
        statements: [{
            effect: "Allow",
            principals: [{
                type: "Service",
                identifiers: ["ec2.amazonaws.com"]
            }],
            actions: ["sts:AssumeRole", "sts:TagSession"]
        }]
    }).json
});

const attachments = [
    new aws.iam.RolePolicyAttachment("eks-node-role-policy-worker-node-minimal", {
        role: nodeRole,
        policyArn: "arn:aws:iam::aws:policy/AmazonEKSWorkerNodeMinimalPolicy",
    }),
    new aws.iam.RolePolicyAttachment("eks-node-role-policy-ecr-pull", {
        role: nodeRole,
        policyArn: "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly",
    }),
];

// Create the EKS cluster
const eksCluster = new eks.Cluster("eks-cluster", {
    vpcId: eksVpc.vpcId,
    authenticationMode: eks.AuthenticationMode.Api,
    publicSubnetIds: eksVpc.publicSubnetIds,
    privateSubnetIds: eksVpc.privateSubnetIds,
    autoMode: {
        enabled: true,
        createNodeRole: false,
        computeConfig: {
            nodeRoleArn: nodeRole.arn,
        }
    }
}, { dependsOn: [...attachments] });

// Export some values for use elsewhere
export const kubeconfig = eksCluster.kubeconfig;
export const clusterName = eksCluster.eksCluster.name;

This seems to cause computeConfig.enabled to be incorrectly treated as false.

@flostadler
Contributor Author

flostadler commented Jan 27, 2025

I have a hunch that one of the issues is the use of customizeDiff in the upstream provider. customizeDiff does not support unknowns; the Terraform docs state: "CustomizeDiff does not currently support computed/'known after apply' values from other resource attributes."

So whenever auto mode is enabled and one of computeConfig, kubernetesNetworkConfig.elasticLoadBalancing, or storageConfig.blockStorage becomes unknown, it'll lead to errors. A sketch of the suspected failure mode follows.
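
As a hedged illustration of that hunch (a hypothetical, simplified Go sketch, not the actual upstream code): in SDK v2, a CustomizeDiff function that reads an attribute which is unknown at plan time gets the type's zero value back, so an unknown bool silently reads as false.

package eks

import (
	"context"
	"fmt"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// validateAutoModeFlags is a hypothetical stand-in for the upstream
// CustomizeDiff validation. ResourceDiff.Get cannot represent unknown
// ("known after apply") values, so each lookup below falls back to the
// zero value (false) while the attribute is still computed.
func validateAutoModeFlags(ctx context.Context, d *schema.ResourceDiff, meta interface{}) error {
	computeEnabled, _ := d.Get("compute_config.0.enabled").(bool)                                 // unknown -> false
	elbEnabled, _ := d.Get("kubernetes_network_config.0.elastic_load_balancing.0.enabled").(bool) // unknown -> false
	storageEnabled, _ := d.Get("storage_config.0.block_storage.0.enabled").(bool)                 // unknown -> false

	// When one value is unknown and the others are true, the comparison
	// fails here even though the applied values would all agree.
	if computeEnabled != elbEnabled || elbEnabled != storageEnabled {
		return fmt.Errorf("compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false")
	}
	return nil
}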

I'm able to repro the same in Terraform with the config below. The crucial bit here is that the three auto-mode settings need to either all be unknown, all be false, or all be true. Upstream is not handling the unknown case correctly and interprets it as false. I'll cut a ticket with them.

terraform {
    required_providers {
        aws = {
            source  = "hashicorp/aws"
        }
    }
}

provider "aws" {
    region = "us-west-2"
}

resource "random_id" "example" {
  byte_length = 8
}

locals {
  # Use a conditional expression to create a computed boolean
  auto_mode_enabled = random_id.example.dec % 2 == 0
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "my-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-west-2a", "us-west-2b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]
}

resource "aws_eks_cluster" "example" {
  name = "auto-mode-issues"

  access_config {
    authentication_mode = "API"
  }

  role_arn = aws_iam_role.cluster.arn
  version  = "1.31"

  vpc_config {
    subnet_ids = [
      module.vpc.private_subnets[0],
      module.vpc.private_subnets[1],
    ]
  }

  dynamic "compute_config" {
    for_each = local.auto_mode_enabled ? [1] : []
    content {
      enabled = local.auto_mode_enabled
      node_role_arn = aws_iam_role.node.arn
      node_pools = ["general-purpose", "system"]
    }
  }

  kubernetes_network_config {
    elastic_load_balancing {
      enabled = true
    }
  }

  storage_config {
    block_storage {
      enabled = true
    }
  }

  depends_on = [
    aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy,
    aws_iam_role_policy_attachment.node_worker,
    aws_iam_role_policy_attachment.node_ecr_pull,
  ]
}

resource "aws_iam_role" "cluster" {
  name_prefix = "eks-cluster-example"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "sts:AssumeRole",
          "sts:TagSession"
        ]
        Effect = "Allow"
        Principal = {
          Service = "eks.amazonaws.com"
        }
      },
    ]
  })
}

resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.cluster.name
}

resource "aws_iam_role" "node" {
  name_prefix = "eks-cluster-node"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "sts:AssumeRole",
          "sts:TagSession"
        ]
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      },
    ]
  })
}

resource "aws_iam_role_policy_attachment" "node_worker" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodeMinimalPolicy"
  role       = aws_iam_role.node.name
}

resource "aws_iam_role_policy_attachment" "node_ecr_pull" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly"
  role       = aws_iam_role.node.name
}

@flostadler
Contributor Author

I created an upstream fix for it: hashicorp/terraform-provider-aws#41155
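
For reference, one shape such a fix could take (a hypothetical sketch under the same SDK v2 assumptions as above; see the linked PR for the actual change) is to detect unknowns in the raw plan and defer the validation until apply, rather than misreading unknown as false:

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

// skipAutoModeValidation is a hypothetical guard: it reports whether any
// of the three auto-mode blocks is still unknown in the planned value,
// in which case the consistency check should be deferred.
func skipAutoModeValidation(d *schema.ResourceDiff) bool {
	rawPlan := d.GetRawPlan()
	for _, attr := range []string{"compute_config", "kubernetes_network_config", "storage_config"} {
		if !rawPlan.GetAttr(attr).IsWhollyKnown() {
			return true // "known after apply"; don't treat it as false
		}
	}
	return false
}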
