Reputation: 91
We run gitlab-ee-12.10.12.0 under Docker and use Kubernetes to manage the gitlab-runner.
All of a sudden a couple of days ago, all my pipelines, in all my projects, stopped working. NOTHING CHANGED except I pushed some code. Yet ALL projects (even those with no repo changes) are failing. I've looked at every certificate I can find anywhere in the system and they're all good so it wasn't a cert expiry. Disk space is at 45% so it's not that. Nobody logged into the server. Nobody touched any admin screens. One code push triggered the pipeline successfully, next one didn't. I've looked at everything. I've updated the docker images for gitlab and gitlab-runner. I've deleted every kubernetes pod I can find in the namespace and let them get relaunched (my go-to for solving k8s problems :-) ).
Every pipeline run in every project now says this:
Running with gitlab-runner 14.3.2 (e0218c92)
on Kubernetes Runner vXpkH225
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: gitlab
Using Kubernetes executor with image lxnsok01.wg.dir.telstra.com:9000/broadworks-build:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:00
ERROR: Error cleaning up configmap: resource name may not be empty
ERROR: Job failed (system failure): prepare environment: setting up build pod: error setting ownerReferences: configmaps "runner-vxpkh225-project-47-concurrent-0-scripts9ds4c" is forbidden: User "system:serviceaccount:gitlab:gitlab" cannot update resource "configmaps" in API group "" in the namespace "gitlab". Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
That URL talks about bash logout scripts containing bad things, but nothing changed. At least, we didn't change anything. I don't believe the second error, which implies that the user lacks permissions, is accurate; it seems to just be saying that the user couldn't do it. I think the primary error is the previous one, about the configmap cleanup. Again, no serviceaccounts, roles, rolebindings, etc. have changed in any way.
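One way to test whether the permission error is real is to ask the API server directly. This is only a sketch: the helper name `can_sa_do` is made up for illustration, and it assumes you have kubectl access to the cluster with rights to impersonate service accounts. The namespace and service account in the usage line are the ones from the error message above.

```shell
# Ask the API server whether a service account may perform a verb on configmaps.
# kubectl prints "yes" or "no" (and exits non-zero on "no").
can_sa_do() {
  # $1 = namespace, $2 = service account name, $3 = verb (e.g. update)
  kubectl auth can-i "$3" configmaps \
    --namespace "$1" \
    --as "system:serviceaccount:$1:$2"
}

# Usage (values taken from the error message):
#   can_sa_do gitlab gitlab update
```

If this prints "no", the runner's service account really does lack the `update` verb on configmaps, and the message is describing a genuine RBAC gap rather than a side effect of the cleanup error.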
So I'm trying to work out what may CAUSE that error. What does it MEAN? What resource name is empty? Where can I find out?
I've checked the output from "docker container logs " and it says exactly what's in the error above. No more, no less.
The only thing I can think of is that perhaps gitlab-runner 14.3.2 doesn't like my k8s or the config. Going back and checking, it seems the runner version has changed: previous working pipelines ran on 14.1.
So two questions then: 1) Any ideas how to fix the problem (e.g. update some config, clear some crud, whatever)? and 2) How do I get gitlab to use a runner other than :latest?
Upvotes: 1
Views: 2828
Reputation: 91
Turns out something DID change. A new gitlab-runner version was released, and Kubernetes pulled gitlab/gitlab-runner:latest between runs. It seems gitlab-runner 14.3 has a problem with my Kubernetes setup. I went back through my pipelines and the last successful one was using 14.1.
So, after a day of working through it, I edited the relevant k8s deployment to change the image tag used for gitlab-runner from :latest to :v14.1.0, which is the last one that worked for me.
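For anyone wanting to do the same without hand-editing the deployment, the change can be sketched with `kubectl set image`. The helper name `pin_runner_image` is made up for illustration, and the deployment and container names in the usage line are assumptions; check yours with `kubectl -n gitlab get deployments` first.

```shell
# Pin the gitlab-runner container in a deployment to a fixed image tag
# instead of :latest.
pin_runner_image() {
  # $1 = namespace, $2 = deployment name, $3 = container name, $4 = image tag
  kubectl -n "$1" set image "deployment/$2" "$3=gitlab/gitlab-runner:$4"
}

# Usage (deployment and container names are assumed, not from my cluster):
#   pin_runner_image gitlab gitlab-runner gitlab-runner v14.1.0
```

With a fixed tag in place, a pod restart can no longer pick up a newer runner release on its own.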
Maybe I'll wait a few weeks and try a later version (now that I know how to easily change that tag) to see whether the issue gets fixed, and perhaps raise an issue with the gitlab-runner project.
Upvotes: 2