GitLab CI — Parallel with Matrix
Efficient Pipelines for Parallelisation
This feature was introduced back in 2020, if I recall correctly, and it has some great capabilities, especially for parallelisation.
A detailed CI guide is available in the GitLab docs if you need a deep dive.
Let's now understand the “Matrix” feature.
CI Syntax
Instead of creating multiple jobs, resulting in blocked pipelines, wouldn’t it be easier and cleaner to run them in parallel?
YES! We can build it using the parallel:matrix keyword in the CI YAML.
The benefit of these variables is that each generated job receives one combination of values, exposed to the job as environment variables. The example below simply prints their values, but they can also drive more advanced tasks, such as conditional decisions.
stages:
  - matrix

Matrix:
  stage: matrix
  image: alpine:latest
  parallel:
    matrix:
      - CLOUD:
          - aws
          - azure
          - gcp
        ARCH:
          - kubernetes
          - service-mesh
  script:
    - echo "Hello from $ARCH from $CLOUD"
This is what the pipeline looks like: the matrix expands into six jobs, one for each combination of CLOUD and ARCH.
Isn’t it GREAT?
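To illustrate the conditional decisions mentioned earlier, here is a minimal sketch of the same job branching on $CLOUD inside its script. The job name and the echo messages are hypothetical; only the CLOUD and ARCH variables come from the example above.

```yaml
stages:
  - matrix

# Hypothetical job: same matrix as above, but the script branches on CLOUD.
Matrix-conditional:
  stage: matrix
  image: alpine:latest
  parallel:
    matrix:
      - CLOUD:
          - aws
          - azure
          - gcp
        ARCH:
          - kubernetes
          - service-mesh
  script:
    # Matrix variables are ordinary environment variables,
    # so a plain POSIX shell test is enough to branch on them.
    - |
      if [ "$CLOUD" = "aws" ]; then
        echo "Running AWS-specific steps for $ARCH"
      else
        echo "Running generic steps for $ARCH on $CLOUD"
      fi
```

Each of the six generated jobs runs the same script, but only the two aws jobs take the first branch.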
Mixing parallel:matrix with !reference
Imagine we need the same variable values in additional jobs in later stages of a deployment. Instead of duplicating the matrix in every job, we can define it once in a hidden job and reuse it.
Prepare the template as below:
.parallel-matrix:
  parallel:
    matrix:
      - CLOUD:
          - aws
          - azure
          - gcp
        ARCH:
          - kubernetes
          - service-mesh
Now, let's merge this into the actual pipeline:
stages:
  - matrix

.parallel-matrix:
  parallel:
    matrix:
      - CLOUD:
          - aws
          - azure
          - gcp
        ARCH:
          - kubernetes
          - service-mesh

Matrix:
  stage: matrix
  image: alpine:latest
  parallel: !reference [.parallel-matrix, parallel]
  script:
    - echo "Hello from $ARCH from $CLOUD"
THIS also WORKS!
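The payoff of the template shows up once a second job needs the same matrix. Here is a minimal sketch, assuming a hypothetical deploy stage and a hypothetical Deploy job that are not part of the original pipeline:

```yaml
stages:
  - matrix
  - deploy   # hypothetical later stage

.parallel-matrix:
  parallel:
    matrix:
      - CLOUD:
          - aws
          - azure
          - gcp
        ARCH:
          - kubernetes
          - service-mesh

Matrix:
  stage: matrix
  image: alpine:latest
  parallel: !reference [.parallel-matrix, parallel]
  script:
    - echo "Hello from $ARCH from $CLOUD"

# Hypothetical job: reuses the same matrix without copying its values.
Deploy:
  stage: deploy
  image: alpine:latest
  parallel: !reference [.parallel-matrix, parallel]
  script:
    - echo "Deploying $ARCH to $CLOUD"
```

With this layout, adding or removing a value in .parallel-matrix changes both jobs at once.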
Why not extends then?
Let's give it a try now:
stages:
  - matrix

.parallel-matrix:
  parallel:
    matrix:
      - CLOUD:
          - aws
          - azure
          - gcp
        ARCH:
          - kubernetes
          - service-mesh

Matrix:
  stage: matrix
  image: alpine:latest
  extends: .parallel-matrix
  script:
    - echo "Hello from $ARCH from $CLOUD"
IT STILL WORKS!
This is how the GitLab documentation explains it:
You can use extends to merge hashes but not arrays. The algorithm used for merge is “closest scope wins,” so keys from the last member always override anything defined on other levels.
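To make the quoted rule concrete, here is a small sketch of how the merge behaves. The .base template, the Greeting job and the GREETING variable are hypothetical names used only for illustration.

```yaml
# Hypothetical template
.base:
  image: alpine:latest
  variables:
    GREETING: "Hello"
    CLOUD: aws
  script:
    - echo "from the template"

# Hypothetical job extending the template
Greeting:
  extends: .base
  variables:
    CLOUD: gcp    # hash: merged key by key, so GREETING is kept and CLOUD becomes gcp
  script:         # array: not merged, the template's script is replaced entirely
    - echo "$GREETING from $CLOUD"
```

The same rule applies to the matrix itself: matrix is an array, so a job that extends .parallel-matrix and also defines its own parallel:matrix replaces the template's entries rather than appending to them.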
Common Errors
Below is the error you will see when a matrix generates more than 200 jobs:
parallel:matrix config generates too many jobs (maximum is 200)
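The solution is to split the matrix across several jobs, listing each configuration the matrix would otherwise have generated. That can lead to an unnecessarily large number of job definitions, so do the math before adding values: the number of generated jobs is the product of the number of values of each variable (3 × 2 = 6 in the example above).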
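One way to keep the count under control is to list only the combinations you actually need. parallel:matrix accepts several entries, and each entry expands on its own, so the total is the sum of the per-entry products. Here is a sketch that reuses the CLOUD and ARCH values from above; the job name and the choice of combinations are hypothetical:

```yaml
stages:
  - matrix

# Hypothetical job: 4 jobs instead of the full 3 x 2 = 6.
Matrix-selected:
  stage: matrix
  image: alpine:latest
  parallel:
    matrix:
      # Entry 1: 1 cloud x 2 architectures = 2 jobs
      - CLOUD: aws
        ARCH:
          - kubernetes
          - service-mesh
      # Entry 2: 2 clouds x 1 architecture = 2 jobs
      - CLOUD:
          - azure
          - gcp
        ARCH:
          - kubernetes
  script:
    - echo "Hello from $ARCH from $CLOUD"
```

Writing the per-entry count as a comment next to each entry makes it easy to check the total against the limit before pushing.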
Final Words
Both extends and !reference have their benefits. Evaluate both techniques and adopt whichever fits your CI/CD processes.