Benjamin

Reputation: 585

Parallel Matrix Blocking Gitlab Pipeline Execution

My company uses self-managed AWS auto-scaling Docker runners, via Docker Machine. This configuration is documented here

We have a single runner-manager EC2 instance whose config.toml contains several runner configs, each registered with different tags. This gives each group in our Gitlab org a dedicated runner: the runner manager spins up the appropriate executor for the tag in the job definition.

The runner for my group has been working flawlessly for months. Today I created a job using the parallel:matrix keyword:

Build Images:
  image: myimage
  stage: build
  script: 
    - docker build -f $DOCKERFILE -t $IMAGE_TAG
    - docker push $IMAGE_TAG
  parallel:
    matrix:
      - DOCKERFILE: $CI_PROJECT_DIR/Dockerfile
        IMAGE_TAG: myrepo/myimage:standard
      - DOCKERFILE: $CI_PROJECT_DIR/super.Dockerfile
        IMAGE_TAG: myrepo/myimage:super
  rules:
    - when: always

When I push a commit, neither this job nor any of the others that should run is triggered. There is no error message, and the CI/CD -> Jobs page does not show any jobs either.

This is the config.toml used on the runner manager. The runner I am attempting to run this job with is the first one, "my-runner":

concurrent = 100
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "my-runner"
  limit = 6
  url = "https://gitlab.com"
  token = "XYZABC"
  executor = "docker+machine"
  [runners.custom_build_dir]
  [runners.cache]
    Type = "s3"
    Path = "cache"
    Shared = true
    [runners.cache.s3]
      ServerAddress = "s3.amazonaws.com"
      BucketName = "mybucket"
      BucketLocation = "us-east-1"
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "alpine:latest"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
    shm_size = 0
  [runners.machine]
    IdleCount = 0
    IdleTime = 600
    MaxBuilds = 10
    MachineDriver = "amazonec2"
    MachineName = "gitlab-docker-machine-%s"
    MachineOptions = ["amazonec2-instance-type=t3.medium", "amazonec2-vpc-id=vpc-xxxxxxxx", "amazonec2-security-group=my-security-group", "amazonec2-iam-instance-profile=xxxxxx", "amazonec2-root-size=32", "amazonec2-ami=ami-218k65t87w8b6posq", "amazonec2-subnet-id=subnet-xxxxxxxxx", "amazonec2-zone=a"]

    [[runners.machine.autoscaling]]
      Periods = ["* * 13-23 * * mon-fri *"]
      Timezone = "UTC"
      IdleCount = 1
      IdleTime = 600

    [[runners.machine.autoscaling]]
      Periods = ["* * 2-11 * * * *"]
      Timezone = "UTC"
      IdleCount = 0
      IdleTime = 300

[[runners]]
  name = "other-runner"
  limit = 6
  url = "https://gitlab.com"
  token = "LMNOP"
  executor = "docker+machine"
...
...
...

There are several more runners defined in this config, but they are all very similar. Each is registered with different tags.

My question: the Gitlab CI docs say

Multiple runners must exist, or a single runner must be configured to run multiple jobs concurrently

and to me it seems like multiple runners do exist, since the runner I am using has a limit of 6. Do the executors need to actually be spun up and sitting idle for this to work? Is there any way that I can get these parallel jobs to run without increasing my runner idle count?

Edit: Some additional information. This is just one job of about a dozen in this file (load-tests.yml).
My gitlab-ci.yml file imports jobs from about 10 other files via

include:
    - local: .gitlab/load-test.yml

The pipeline never gets created. If I comment out this job, the pipeline runs, including the other jobs in this file.

I can provide the entire file verbatim, but everything works fine if this job is not included. I'm fairly experienced with Gitlab-CI so I'm sure that the issue lies with this job and/or the runner config when using these keywords.

Tag and other keys are set in defaults in the .gitlab-ci.yml file. None of them are of significance, things like default variables, default before_script, cache, etc.

Edit: We are using Gitlab SaaS (Premium, I believe, but I'm not sure), with the runner manager on Gitlab-Runner v14.1.0.

Upvotes: 0

Views: 486

Answers (1)

iamlucas

Reputation: 257

I'm unable to replicate the issue, and it appears that the CI jobs are being generated as expected in the latest version of GitLab.

One potential concern I've identified is that variable expansion within the "parallel" keyword is currently unsupported. There's an open issue regarding this feature request. While it's unclear if this directly causes the problem of CI jobs not being created, it might be worth investigating.

You can refer to the documentation to see where variables can be utilized.
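If variable expansion in the matrix does turn out to be the problem, one thing to try is keeping the matrix values literal and adding the $CI_PROJECT_DIR prefix in the script, where the shell on the runner expands it. This is only a sketch based on the job in the question. It assumes both Dockerfiles sit at the repository root, and it adds a build context argument that the original command did not have.

```yaml
Build Images:
  image: myimage
  stage: build
  script:
    # The runner's shell expands these variables at job run time,
    # so the matrix itself no longer needs to reference $CI_PROJECT_DIR.
    - docker build -f "$CI_PROJECT_DIR/$DOCKERFILE" -t "$IMAGE_TAG" "$CI_PROJECT_DIR"
    - docker push "$IMAGE_TAG"
  parallel:
    matrix:
      # Literal values only: no variables inside the matrix.
      - DOCKERFILE: Dockerfile
        IMAGE_TAG: myrepo/myimage:standard
      - DOCKERFILE: super.Dockerfile
        IMAGE_TAG: myrepo/myimage:super
  rules:
    - when: always
```

I have not confirmed that this fixes the missing pipeline, but it removes the one part of the job that depends on expansion inside "parallel".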

To work around the limitation of variable expansion in the "parallel" keyword, you can use "envsubst" with a template file, then trigger the generated configuration from an artifact. A detailed example is in the question "gitlab not expanding variable used in parallel/matrix".
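A rough sketch of that approach, not taken from the linked question: the file names, the job names, the "generate" stage, and the IMAGE_REPO variable are all hypothetical, and installing gettext to get envsubst assumes an Alpine image. First, a template in which only the placeholder ${IMAGE_REPO} is meant to be substituted:

```yaml
# .gitlab/build-images.template.yml (hypothetical file name)
Build Images:
  image: myimage
  script:
    - docker build -f "$DOCKERFILE" -t "$IMAGE_TAG" .
    - docker push "$IMAGE_TAG"
  parallel:
    matrix:
      - DOCKERFILE: Dockerfile
        IMAGE_TAG: ${IMAGE_REPO}:standard
      - DOCKERFILE: super.Dockerfile
        IMAGE_TAG: ${IMAGE_REPO}:super
```

Then, in the main configuration, one job renders the template and a second job triggers the result as a child pipeline:

```yaml
# Assumes a "generate" stage has been added before "build" in the stages list.
generate-config:
  stage: generate
  image: alpine:latest
  variables:
    IMAGE_REPO: myrepo/myimage   # hypothetical variable used by the template
  script:
    - apk add --no-cache gettext   # provides envsubst on Alpine
    # Restrict substitution to IMAGE_REPO so $DOCKERFILE and $IMAGE_TAG stay untouched.
    - envsubst '${IMAGE_REPO}' < .gitlab/build-images.template.yml > build-images.generated.yml
  artifacts:
    paths:
      - build-images.generated.yml

run-generated-builds:
  stage: build
  trigger:
    include:
      - artifact: build-images.generated.yml
        job: generate-config
    strategy: depend   # parent job mirrors the child pipeline's status
```

Because the child pipeline is a separate configuration, anything the jobs rely on from the defaults in .gitlab-ci.yml (runner tags, before_script, and so on) would likely need to be repeated in the template.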

Upvotes: 1
