Angivare

Reputation: 476

Terraform forces AKS node pool replacement without any changes

I have the following resource definition for additional node pools in my k8s cluster:

resource "azurerm_kubernetes_cluster_node_pool" "extra" {
  for_each = var.node_pools

  kubernetes_cluster_id   = azurerm_kubernetes_cluster.k8s.id
  name                    = each.key
  vm_size                 = each.value["vm_size"]
  node_count              = each.value["count"]
  node_labels             = each.value["labels"]
  vnet_subnet_id          = var.subnet.id
}

Here is the output from terraform plan:

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # module.aks.azurerm_kubernetes_cluster_node_pool.extra["general"] has been changed
  ~ resource "azurerm_kubernetes_cluster_node_pool" "extra" {
      + availability_zones     = []
        id                     = "/subscriptions/3913c9fe-c571-4af9-bc9a-533202d41061/resourcegroups/amic-resources/providers/Microsoft.ContainerService/managedClusters/amic-k8s-01/agentPools/general"
        name                   = "general"
      + node_taints            = []
      + tags                   = {}
        # (18 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo or respond to these changes.

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # module.aks.azurerm_kubernetes_cluster_node_pool.extra["general"] must be replaced
-/+ resource "azurerm_kubernetes_cluster_node_pool" "extra" {
      - availability_zones     = [] -> null
      - enable_auto_scaling    = false -> null
      - enable_host_encryption = false -> null
      - enable_node_public_ip  = false -> null
      ~ id                     = "/subscriptions/3913c9fe-c571-4af9-bc9a-533202d41061/resourcegroups/amic-resources/providers/Microsoft.ContainerService/managedClusters/amic-k8s-01/agentPools/general" -> (known after apply)
      ~ kubernetes_cluster_id  = "/subscriptions/3913c9fe-c571-4af9-bc9a-533202d41061/resourcegroups/amic-resources/providers/Microsoft.ContainerService/managedClusters/amic-k8s-01" -> "/subscriptions/3913c9fe-c571-4af9-bc9a-533202d41061/resourceGroups/amic-resources/providers/Microsoft.ContainerService/managedClusters/amic-k8s-01" # forces replacement
      - max_count              = 0 -> null
      ~ max_pods               = 30 -> (known after apply)
      - min_count              = 0 -> null
        name                   = "general"
      - node_taints            = [] -> null
      ~ orchestrator_version   = "1.20.7" -> (known after apply)
      ~ os_disk_size_gb        = 128 -> (known after apply)
      - tags                   = {} -> null
        # (9 unchanged attributes hidden)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

As you can see, Terraform forces replacement of my node pool because of a change in kubernetes_cluster_id, even though the only difference between the two values is the casing of the resourcegroups path segment. I've been able to work around this by ignoring kubernetes_cluster_id changes in a lifecycle block (sketched below), but I am still puzzled as to why Terraform detects a change there.
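For reference, the workaround looks roughly like this (just a sketch; the remaining arguments are unchanged from the definition above):

resource "azurerm_kubernetes_cluster_node_pool" "extra" {
  for_each = var.node_pools

  kubernetes_cluster_id = azurerm_kubernetes_cluster.k8s.id
  # ... same arguments as in the definition above ...

  lifecycle {
    # Don't treat drift in the cluster ID as a change that needs an apply.
    ignore_changes = [kubernetes_cluster_id]
  }
}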

So why does Terraform report a change here when nothing has actually changed?

Upvotes: 5

Views: 1915

Answers (2)

Marcel Šerý

Reputation: 316

I fixed this weird bug by introducing a lifecycle block, as follows:

resource "azurerm_kubernetes_cluster_node_pool" "my-node-pool" {
  name = "mynodepool"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.aks.id

  ...

  lifecycle {
    ignore_changes = [
      kubernetes_cluster_id
    ]
  }
} 

Not the cleanest way, but it works. The cluster ID shouldn't change unless you recreate the whole AKS cluster, so ignoring it should be safe.

Upvotes: 1

Alin Valentin

Reputation: 547

I'm not proud of it, but I managed to work around this bug using Terraform's replace string function.

resource "azurerm_kubernetes_cluster_node_pool" "extra" {
  [...]
  # Use this once the bug gets fixed in the provider, and delete the workaround.
  # kubernetes_cluster_id   = azurerm_kubernetes_cluster.k8s.id 
  kubernetes_cluster_id   = replace(azurerm_kubernetes_cluster.k8s.id, "resourceGroups", "resourcegroups")

  [...]
}

NOTE: I'm searching for resourceGroups rather than /resourceGroups/ because replace treats a search string wrapped in forward slashes as a regular expression, which might end up duplicating your forward slashes. (I didn't test this myself.)
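To illustrate the difference (an untested sketch with a made-up resource ID; Terraform's replace treats a slash-wrapped search string as a regex, so the delimiting slashes are dropped from the pattern while the replacement string keeps its literal slashes):

locals {
  # Hypothetical resource ID, for illustration only.
  id = "/subscriptions/0000/resourceGroups/my-rg"

  # Plain-string search: only the casing of the segment changes.
  safe = replace(local.id, "resourceGroups", "resourcegroups")
  # => "/subscriptions/0000/resourcegroups/my-rg"

  # Slash-wrapped search is parsed as the regex resourceGroups; the
  # replacement's literal slashes then land next to the existing ones.
  risky = replace(local.id, "/resourceGroups/", "/resourcegroups/")
  # => "/subscriptions/0000//resourcegroups//my-rg"
}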

Upvotes: 2
