Reputation: 3743
I had a Terraform deployment that deployed GKE cluster pools on GCP, and it stopped working with this error:
Error: Error applying plan:
1 error(s) occurred:
* google_container_cluster.primary: 1 error(s) occurred:
* google_container_cluster.primary: Post https://container.googleapis.com/v1/projects/...-gcp-poc/zones/europe-west1-d/clusters?alt=json: dial tcp: i/o timeout
I can still deploy manually via the console.
I can still deploy with the gcloud CLI:
gcloud container clusters create cluster_name --zone europe-west1-b
I tried changing the credentials JSON file, to no avail.
It happened after an upgrade of the google plugin from 1.4 to 1.5. My Mac has been restarted since.
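For anyone hitting the same "dial tcp: i/o timeout", one way to separate a network problem from a plugin problem is to probe the API host from the same machine. This is only a rough sketch: check_endpoint is a hypothetical helper name, and the 10-second default timeout is an arbitrary assumption. It reports "reachable" whenever curl gets any HTTP response at all (even a 401 or 404), since the point is only to see whether the connection can be opened.

```shell
# check_endpoint URL [TIMEOUT_SECONDS]
# Hypothetical helper: prints "reachable: URL" if curl completes a request
# within the timeout, otherwise prints "unreachable: URL" and returns 1.
check_endpoint() {
  # $1 = URL to probe, $2 = optional timeout in seconds (assumed default: 10)
  if curl --silent --output /dev/null --max-time "${2:-10}" "$1"; then
    echo "reachable: $1"
  else
    echo "unreachable: $1"
    return 1
  fi
}
```

Usage would be something like check_endpoint https://container.googleapis.com/ 10. If that reports reachable while terraform apply still times out, the network itself is probably not the culprit.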
Upvotes: 0
Views: 1648
Reputation: 973
In my case I got this error when I tried to create a deployment on a cluster that I had just created (via Terraform):
Error: Failed to create deployment: Post https://32.244.226.151/apis/apps/v1/namespaces/default/deployments: dial tcp 35.242.229.150:443: i/o timeout
What solved the problem for me was to reconnect kubectl to the cluster:
gcloud container clusters list
gcloud container clusters get-credentials PUT_CLUSTER_NAME_HERE
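The reconnect step can be combined with a quick check that kubectl really reaches the new cluster. A minimal sketch, assuming a zonal cluster like the one in the question; reconnect_kubectl is a hypothetical function name, and the cluster name and zone are whatever yours are:

```shell
# reconnect_kubectl CLUSTER_NAME ZONE
# Hypothetical helper: refreshes the kubeconfig entry for the cluster, then
# asks the API server for its info. kubectl only runs if gcloud succeeded.
reconnect_kubectl() {
  # $1 = cluster name, $2 = zone (assumes a zonal cluster)
  gcloud container clusters get-credentials "$1" --zone "$2" &&
    kubectl cluster-info
}
```

For example: reconnect_kubectl PUT_CLUSTER_NAME_HERE europe-west1-b. If kubectl cluster-info also times out, the stale kubeconfig was probably not the only problem.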
UPDATE: I added this:
provider "kubernetes" {
  host                   = "${google_container_cluster.primary.endpoint}"
  client_certificate     = "${base64decode(google_container_cluster.primary.master_auth.0.client_certificate)}"
  client_key             = "${base64decode(google_container_cluster.primary.master_auth.0.client_key)}"
  cluster_ca_certificate = "${base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)}"
}
and
/**
 * Submit the job - Terraform doesn't yet support StatefulSets, so we have to
 * shell out.
 * See: https://github.com/sethvargo/vault-on-gke/blob/master/terraform/gcp.tf
 */
resource "null_resource" "apply" {
  depends_on = ["google_container_node_pool.primary_preemptible_nodes"]

  provisioner "local-exec" {
    command = <<EOF
gcloud container clusters get-credentials "${google_container_cluster.primary.name}" \
  --project="${google_container_cluster.primary.project}"
gcloud container clusters list
EOF
  }
}
which solved the issue completely for me.
Note: my cluster resource is resource "google_container_cluster" "primary" { ... }
Upvotes: 0
Reputation: 3743
I ended up deleting the .terraform folder and replacing it with an old one I had that contained google plugin 1.4, then ran:
terraform init
terraform plan
terraform apply
This worked even though I got this error:
Error: Error applying plan:
1 error(s) occurred:
* google_container_cluster.rtp_container_cluster: 1 error(s) occurred:
* google_container_cluster.rtp_container_cluster: Error reading instance group manager returned as an instance group URL: Get https://www.googleapis.com/compute/v1/projects/rtp-gcp-poc/zones/europe-west1-b/instanceGroupManagers/gke-rtp-container-cluste-default-pool-8bb9aa85-grp?alt=json: dial tcp: i/o timeout
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
Then I connected via kubectl:
➜ ~ kubectl get node
NAME                                                  STATUS   ROLES    AGE   VERSION
gke-...-container-cluste-default-pool-8bb9aa85-7kcb   Ready    <none>   14m   v1.8.6-gke.0
I ran
terraform apply
again and, presto, the deployment finished.
Since I have excellent connectivity, this smells like a google plugin bug to me.
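Since simply applying again got past the timeout, the re-run can be wrapped in a small retry loop. This is a workaround sketch, not a fix for the underlying problem; retry is a hypothetical helper, and the attempt count and delay are arbitrary values to adjust:

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS runs have
# failed, sleeping DELAY_SECONDS between runs. Returns 0 on the first success
# and 1 once every attempt has failed.
retry() {
  retry_attempts="$1"
  retry_delay="$2"
  shift 2
  retry_n=1
  while :; do
    "$@" && return 0
    if [ "$retry_n" -ge "$retry_attempts" ]; then
      echo "giving up after $retry_n attempt(s)" >&2
      return 1
    fi
    echo "attempt $retry_n failed; retrying in ${retry_delay}s" >&2
    retry_n=$((retry_n + 1))
    sleep "$retry_delay"
  done
}
```

For example: retry 3 30 terraform apply -auto-approve. The -auto-approve flag skips the interactive confirmation, so only use it once you are happy with the plan. Because Terraform keeps the partially updated state, each re-run continues from whatever was already created.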
Upvotes: 0