We use a Bitbucket self-hosted runner for our pipelines, installed on an EC2 instance running Amazon Linux.
The pipelines ran properly for a month. After that, the pipeline containers started failing with Docker exit code 137.
Note: We scheduled yum updates to run every week on the runner machine.
So, the following are the action items I performed
None of this helped. The pipelines continued to fail with the same error.
Later, because the issue was unpredictable and on the advice of the AWS support team, I moved all the runners to a machine built from an ECS-optimized AMI, since that AMI is designed mainly for running containers. The same thing happened on this machine too: after a month, the pipelines started failing with the same error. This time I investigated the issue again and manually checked the yum updates to see whether any of them conflicted with Docker.
Note: The runner container itself is up and running properly, but the pipeline containers fail with 137, and this is not caused by OOM or a storage issue.
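For context on the number itself: by the usual shell convention, an exit status above 128 means the process was terminated by a signal (status minus 128), so 137 corresponds to signal 9, SIGKILL. The snippet below is only a small sketch that decodes the code. The helper name `describe_exit_code` is made up for illustration, and it assumes a Linux host. It does not say who sent the signal, which could be the kernel OOM killer, the Docker daemon, or another process.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain an exit status using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Map the number to its symbolic name, e.g. 9 -> SIGKILL on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            name = "an unknown signal"
        return f"killed by {name} (signal {signum})"
    return f"exited on its own with status {code}"


print(describe_exit_code(137))
```

So the 137 only confirms that the pipeline container received SIGKILL, not why.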
Later, we moved the runners again, to a different Amazon Linux machine. The same issue appeared for the third time, again after a month.
Nothing has helped so far. While investigating, I found that removing the Docker service declaration (services: docker) from the pipeline makes the pipeline succeed. The following is an example pipeline configuration:
- step:
    name: generate dbt elementary report
    image:
      name: edr-build:latest
    clone:
      enabled: true
    runs-on:
      - 'linux'
      - 'self.hosted'
      - 'test'
    services:
      - docker
    caches:
      - docker
    script:
      - docker --version
      - docker build -t <tag> .
      - docker push to <ecr>
I cannot remove the services: docker section, because I need to run docker commands in the pipeline.
Also, the pipelines started failing all of a sudden. If the problem were really in the services: docker section or in the runner machine, I would expect the pipelines to fail with error 137 from the start.
I cannot view any logs of the failing pipeline containers, because they are removed immediately after the pipeline execution finishes. I cannot list them with "docker ps -a" either. All records of the pipeline containers are erased, which might be by design in the Bitbucket runner.
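Since the containers disappear before they can be inspected, one possible way to keep a record would be to listen to the Docker daemon's event stream on the runner host and log every container "die" and "oom" event as it happens. The following is a rough sketch under assumptions, not something I have confirmed on the runner: `parse_event`, `watch`, the log file name, and the sample line are all hypothetical, and it assumes `docker events --format '{{json .}}'` reports the exit code under `Actor.Attributes.exitCode` for "die" events.

```python
import json
import subprocess

# Hypothetical sample line, shaped like the JSON that `docker events` prints
# for a container "die" event; the values are invented for illustration.
SAMPLE_LINE = (
    '{"Type": "container", "Action": "die", "Actor": {"ID": "abc123", '
    '"Attributes": {"exitCode": "137", "image": "edr-build:latest", '
    '"name": "build-step"}}, "time": 1700000000}'
)


def parse_event(line):
    """Pull the action, container name, image and exit code out of one event line."""
    event = json.loads(line)
    attrs = event.get("Actor", {}).get("Attributes", {})
    exit_code = attrs.get("exitCode")  # expected on "die" events, absent on "oom"
    return {
        "action": event.get("Action"),
        "name": attrs.get("name"),
        "image": attrs.get("image"),
        "exit_code": int(exit_code) if exit_code is not None else None,
    }


def watch(log_path="container-deaths.log"):
    """Append every container die/oom event to a file until interrupted."""
    cmd = [
        "docker", "events",
        "--filter", "type=container",
        "--filter", "event=die",
        "--filter", "event=oom",
        "--format", "{{json .}}",
    ]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc, \
            open(log_path, "a") as log:
        for line in proc.stdout:
            if not line.strip():
                continue
            log.write(json.dumps(parse_event(line)) + "\n")
            log.flush()  # keep the record even if the watcher is killed


print(parse_event(SAMPLE_LINE))
```

Calling `watch()` in a separate session on the runner host before triggering a pipeline should leave one line per dead container in the log file. An "oom" entry for the same container would point to the kernel OOM killer, while a "die" with exit code 137 and no "oom" entry would suggest the kill came from somewhere else.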
I am still not sure where to look or what to check for this error, since the issue occurs on multiple machines.
Note: The Docker version is the same on all the machines.
Please help me with this.