Reputation: 4259
As I did not get anywhere with a standard GKE cluster via Terraform (see GKE permission issue on gcr.io with service account based on terraform), I have now created one with a separate node pool. However, I still cannot get a basic container pulled from an eu.gcr.io private repo.
My Terraform configuration is as follows.
resource "google_container_cluster" "primary" {
  name     = "gke-cluster"
  location = "${var.region}-a"
  node_locations = [
    "${var.region}-b",
    "${var.region}-c",
  ]
  network    = var.vpc_name
  subnetwork = var.subnet_name

  remove_default_node_pool = true
  initial_node_count       = 1

  # minimum kubernetes version for master
  min_master_version = var.min_master_version

  master_auth {
    username = var.gke_master_user
    password = var.gke_master_pass
  }
}
resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "gke-node-pool"
  location   = "${var.region}-a"
  cluster    = google_container_cluster.primary.name
  version    = var.node_version
  node_count = 3

  node_config {
    preemptible = true
    metadata = {
      disable-legacy-endpoints = "true"
    }
    # based on project number
    service_account = "[email protected]"
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}
Everything creates very nicely. I then deploy to the cluster with the following file (deployment.yml):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: api
  template:
    metadata:
      labels:
        component: api
    spec:
      containers:
        - name: api
          image: eu.gcr.io/project-dev/api:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 5060
and the pod keeps failing with:
Failed to pull image "eu.gcr.io/project-dev/api:latest": rpc error: code =
Unknown desc = Error response from daemon: pull access denied for eu.gcr.io/project-dev/api,
repository does not exist or may require 'docker login': denied: Permission denied for
"latest" from request "/v2/project-dev/lcm_api/manifests/latest".
Warning Failed 94s (x2 over 111s) kubelet, gke-cluster-dev-node-pool-90efd247-7vl4 Error: ErrImagePull
I have opened a Cloud Shell on the cluster, and there
docker pull eu.gcr.io/project-dev/api:latest
works just fine.
I am seriously running out of ideas here (and am considering moving back to AWS). Could it have something to do with the permissions under which the container was pushed to eu.gcr.io?
I use:
docker login -u _json_key --password-stdin https://eu.gcr.io < /home/jeroen/.config/gcloud/tf_admin.json
locally, where tf_admin.json is the service-account key of my administration project that created the infrastructure project. I then push:
docker push eu.gcr.io/project-dev/api:latest
Another idea: from the documentation and other Stack Overflow questions (see e.g. GKE - ErrImagePull pulling from Google Container Registry), having the correct service account and OAuth scopes seems to be key. How can I check that the nodes use the right service account when pulling? And whether the scopes are correctly assigned?
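For reference, this is roughly how I try to inspect it (a sketch: the cluster, pool, and project names are taken from the configuration above; the zone must match your own region):

```shell
# Show the service account and OAuth scopes the node pool runs with
# (names taken from the Terraform config above; adjust the zone).
gcloud container node-pools describe gke-node-pool \
    --cluster gke-cluster \
    --zone "${REGION}-a" \
    --format="yaml(config.serviceAccount, config.oauthScopes)"

# eu.gcr.io images are stored in the bucket eu.artifacts.<project>.appspot.com;
# check whether the node service account has read access to it.
gsutil iam get gs://eu.artifacts.project-dev.appspot.com
```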
Upvotes: 2
Views: 1460
Reputation: 1714
Maybe someone finds this useful: in my case, the service account for the cluster was missing the roles/storage.objectViewer role.
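A sketch of granting that role in Terraform, assuming the node service account and project from the question (GCR images are served from a Cloud Storage bucket, which is why a storage role is what the node SA needs):

```hcl
# Grant the node service account read access to the GCS buckets
# that back gcr.io / eu.gcr.io (project and SA names assumed).
resource "google_project_iam_member" "gke_gcr_pull" {
  project = "project-dev"
  role    = "roles/storage.objectViewer"
  member  = "serviceAccount:[email protected]"
}
```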
Upvotes: 0
Reputation: 2265
It seems the official Terraform example with OAuth scopes is outdated and shouldn't be used. My fix is to grant all permissions via OAuth scopes and manage access with IAM roles instead:
oauth_scopes = [
  "https://www.googleapis.com/auth/cloud-platform",
]
You can also check a similar issue.
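A minimal sketch of what "manage it with IAM roles instead" can look like, assuming the node service account and project names from the question (grant only what the nodes actually need):

```hcl
# With the broad cloud-platform scope on the nodes, effective access is
# controlled purely by IAM roles on the node service account (names assumed).
resource "google_project_iam_member" "node_sa_roles" {
  for_each = toset([
    "roles/storage.objectViewer",   # pull images from gcr.io buckets
    "roles/logging.logWriter",      # write node/pod logs
    "roles/monitoring.metricWriter" # write monitoring metrics
  ])

  project = "project-dev"
  role    = each.value
  member  = "serviceAccount:[email protected]"
}
```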
Upvotes: 4