Can a GitLab CI pipeline job be configured as automatic when the prior stage succeeds, but manual otherwise?

Question

I use GitLab CI to deploy my service. Currently I have a single deploy job that releases my changes to every server behind a load balancer.

I would like to do a phased rollout where I deploy to one server behind my load balancer, give it a few minutes to bake and set off any alarms if there is an issue, and then automatically continue deploying to the remaining servers. If an issue occurred before the delayed, automatic full deploy started, I would manually cancel that job to prevent the bad change from going out more widely.

With this goal in mind I configured my pipeline with the following .gitlab-ci.yml:

stages:
 - canary_deploy
 - full_deploy

canary:
 stage: canary_deploy
 allow_failure: false
 when: manual
 script: make deploy-canary

full:
 stage: full_deploy
 when: delayed
 start_in: 10 minutes
 script: make deploy-full

This works relatively well, but I ran into a problem when I tried to push a critical change out quickly. The canary deploy script was hanging, and this prevented the second job from starting because it must wait for the first stage to complete. In this case I would have preferred to skip the canary entirely, but because of the way the pipeline is configured it was not possible to manually invoke the full deploy.

Ideally, I would like the full_deploy stage to run on the usual delay but allow me to force it to start if I don't want to wait. I've reviewed the rules, needs, and when configuration options hoping to find a way to achieve this, but I haven't been able to find a working solution.

Some things I've tried or considered, without luck:

I could create a duplicate full_deploy job which is manual and does not depend on the canary_deploy stage but it feels a bit hacky. And in reality my configuration is a bit more complex than what I've distilled here so there are actually several region-specific deploy jobs and I would prefer not to have to duplicate each of them.
I tried to use rules to consider the status of the prior stage and make the full_deploy manual unless the prior stage was successful. This isn't possible because rules are executed on pipeline creation and cannot dynamically adjust this property at runtime.
I changed the canary_deploy to allow failure, which effectively unblocked the second stage immediately. The problem here is that it caused the delay timer to start counting down immediately upon pipeline creation rather than waiting for the first stage to complete.

Bernhard · Accepted Answer

One thing you could do to make duplicating the full_deploy job feel a little bit less "hacky" is to define it once and then use extends two times:

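stages:
  - canary_deploy
  - full_deploy

# Hidden job (leading dot): never runs itself, only holds the shared script.
.full:
  script: make deploy-full

canary:
  stage: canary_deploy
  allow_failure: false
  when: manual
  script: make deploy-canary

# Normal path: runs on a 10 minute delay once the canary stage has completed.
full_automatic:
  extends: .full
  stage: full_deploy
  when: delayed
  start_in: 10 minutes

# Escape hatch: can be started by hand at any time, without the canary.
full_manual:
  extends: .full
  stage: full_deploy
  when: manual
  needs: []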

This way, you only need to define the script section once, and both the full_manual and the full_automatic jobs use it. When running the pipeline, you can choose which job to run first (full_manual or canary):

Screenshot of the GitLab UI for selecting which job to run

By specifying needs: [], you tell GitLab that the full_manual job does not depend on any other jobs, so it can be started immediately without waiting for the jobs in the canary_deploy stage.

When executing full_manual, the canary job is not executed:

Overview of executed pipeline jobs
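
Since the question mentions several region-specific deploy jobs, the same idea can probably be stretched so that neither the script nor the trigger settings are repeated per region. The following is only a sketch that I have not run against the real setup: the region names, the REGION variable, and passing it to make are all assumptions made for illustration, and the .automatic and .manual template names are made up. It relies on extends accepting a list of hidden jobs, whose keys are merged into the final job.

```yaml
stages:
  - canary_deploy
  - full_deploy

canary:
  stage: canary_deploy
  allow_failure: false
  when: manual
  script: make deploy-canary

# Shared deploy definition. REGION and the way make consumes it are
# hypothetical; substitute whatever the real deploy scripts expect.
.full:
  stage: full_deploy
  script: make deploy-full REGION=$REGION

# Trigger settings for the normal, delayed path.
.automatic:
  when: delayed
  start_in: 10 minutes

# Trigger settings for the escape hatch that skips the canary.
.manual:
  when: manual
  needs: []

# Each region needs only two short jobs that combine the templates.
full_automatic_us:
  extends: [.full, .automatic]
  variables:
    REGION: us

full_manual_us:
  extends: [.full, .manual]
  variables:
    REGION: us

full_automatic_eu:
  extends: [.full, .automatic]
  variables:
    REGION: eu

full_manual_eu:
  extends: [.full, .manual]
  variables:
    REGION: eu
```

Each region still has two jobs in the pipeline view, but every per-region job is reduced to an extends line and a variable, so a change to the deploy command or to the delay only has to be made in one place.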
