Reputation: 63
I assume there are no stupid questions, so here is one that I could not find a direct answer to.
The situation
I currently have a Kubernetes cluster running 1.15.x on AKS, deployed and managed through Terraform. Azure recently announced that they will retire version 1.15 of Kubernetes on AKS, so I need to upgrade the cluster to 1.16 or later. As I understand it, upgrading the cluster directly in Azure would have no consequences for the contents of the cluster, i.e. nodes, pods, secrets and everything else currently on there, but I cannot find any proper answer to what would happen if I upgrade the cluster through Terraform.
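For context, the relevant resource in my configuration looks roughly like this (a trimmed sketch, not my actual files; the names are placeholders and everything except kubernetes_version is omitted):

resource "azurerm_kubernetes_cluster" "default" {
  # name, location, resource_group_name, dns_prefix,
  # node pool and service principal blocks omitted for brevity

  # the only argument I intend to change:
  kubernetes_version = "1.16.13"   # currently "1.15.5"
}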
Potential problems
So what could go wrong? In my mind, the worst outcome would be that the entire cluster is destroyed and a new one created: no pods, no secrets, nothing. Since there is so little information out there, I am asking here to see if there is anyone with more experience with Terraform and Kubernetes who could help me out.
To summarize:
Terraform versions
Terraform v0.12.17
+ provider.azuread v0.7.0
+ provider.azurerm v1.37.0
+ provider.random v2.2.1
What I'm doing
$ terraform init
//running terraform plan with the new Kubernetes version declared for AKS
$ terraform plan
//Following changes are announced by Terraform:
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# module.mycluster.azurerm_kubernetes_cluster.default will be updated in-place...
...
~ kubernetes_version = "1.15.5" -> "1.16.13"
...
Plan: 0 to add, 1 to change, 0 to destroy.
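For reference, the outcome I am afraid of would look different in the plan. If Terraform intended to recreate the cluster, the resource would be marked for replacement, roughly like this (not my actual output, just how Terraform 0.12 flags a replacement):

-/+ destroy and then create replacement

# module.mycluster.azurerm_kubernetes_cluster.default must be replaced
...
Plan: 1 to add, 0 to change, 1 to destroy.

Since my plan reports 0 to destroy, I read that as the signal that nothing will be recreated.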
What I want to happen
Terraform will tell Azure to upgrade the existing AKS service, not destroy it and create a new one. I assume this is what will happen, since Terraform announces that it will "update in-place" rather than adding new or destroying existing clusters.
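As an extra safety net, I am also considering a lifecycle block, which makes Terraform abort any run that would destroy the resource (this is generic Terraform behaviour, not AKS-specific):

resource "azurerm_kubernetes_cluster" "default" {
  # ... existing arguments ...

  lifecycle {
    # fail the run instead of ever destroying the cluster
    prevent_destroy = true
  }
}

With this in place, a plan that would replace the cluster errors out instead of silently queueing a destroy.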
Upvotes: 6
Views: 10361
Reputation: 1
I tested what you described and we could not identify any interruption in service. Everything went smoothly until the node pool failed to update due to the vCPU limit in our region, so we requested a quota increase, which was granted. Despite the portal showing "latest model" as "no", Terraform does not detect any changes.
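For what it's worth, the quota hiccup makes sense: during an upgrade AKS temporarily adds a surge node, which needs vCPU headroom. In newer versions of the azurerm provider (not the 1.37.0 from the question) this surge can be tuned with an upgrade_settings block, something like this sketch with placeholder values:

default_node_pool {
  name       = "default"           # placeholder values
  vm_size    = "Standard_DS2_v2"
  node_count = 2

  upgrade_settings {
    max_surge = "1"   # at most one extra node during the upgrade
  }
}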
Upvotes: 0
Reputation: 330
I found this question today and thought I'd add my experience as well. I made the following changes:
- kubernetes_version under azurerm_kubernetes_cluster, from "1.16.15" -> "1.17.16"
- orchestrator_version under default_node_pool, from "1.16.15" -> "1.17.16"
- node_count under default_node_pool, from 1 -> 2

A terraform plan showed that it was going to update in-place. I then performed a terraform apply, which completed successfully. kubectl get nodes showed that an additional node was created, but both nodes in the pool were still on the old version. After further inspection in the Azure Portal it was found that only the k8s cluster version was upgraded and not the version of the node pool. I then executed terraform plan again, and again it showed that the orchestrator_version under default_node_pool was going to be updated in-place. I then executed terraform apply, which proceeded to upgrade the version of the node pool. It did that whole thing where it creates an additional node in the pool (with the new version) and sets its status to NodeSchedulable, while setting the existing node in the pool to NodeNotSchedulable. The NodeNotSchedulable node is then replaced by a new node with the new k8s version and is eventually set to NodeSchedulable. It did this for both nodes. Afterwards all nodes were upgraded without any noticeable downtime.
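For reference, the arguments I changed sit roughly here in the configuration (a sketch with placeholder names, not my exact files; as far as I know, orchestrator_version needs a newer azurerm provider than the 1.37.0 shown in the question):

resource "azurerm_kubernetes_cluster" "example" {
  # other arguments omitted

  kubernetes_version = "1.17.16"   # control plane version, was "1.16.15"

  default_node_pool {
    name                 = "default"
    vm_size              = "Standard_DS2_v2"   # placeholder
    node_count           = 2                   # was 1
    orchestrator_version = "1.17.16"           # node pool version, was "1.16.15"
  }
}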
Upvotes: 9
Reputation: 859
I'd say this shows that the Terraform method is non-destructive, even if there have at times been oversights in the upgrade process (but still non-destructive in this example): https://github.com/terraform-providers/terraform-provider-azurerm/issues/5541
If you need higher confidence in this change, you could alternatively use the Azure-based upgrade method, refresh the changes back into your state, and tweak the code until a plan no longer shows anything intolerable. The two azurerm_kubernetes_cluster arguments dealing with version might be all you need to tweak.
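Sketched out, that alternative flow would be something like this (resource group and cluster names are placeholders):

$ az aks upgrade --resource-group my-rg --name my-aks --kubernetes-version 1.16.13
// let Azure perform the upgrade outside of Terraform
$ terraform refresh
// pull the upgraded kubernetes_version into the Terraform state
$ terraform plan
// then tweak the version arguments in code until the plan shows no unwanted changes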
Upvotes: 3