CI configuration internals

Workflow rules

Pipelines for the GitLab project are created using the workflow:rules keyword feature of the GitLab CI/CD.

Pipelines are always created for the following scenarios:

  • main branch, including on schedules, pushes, merges, and so on.
  • Merge requests.
  • Tags.
  • Stable, auto-deploy, and security branches.

Pipeline creation is also affected by the following CI/CD variables:

  • If $FORCE_GITLAB_CI is set, pipelines are created. Not recommended to use. See Avoid $FORCE_GITLAB_CI.
  • If $GITLAB_INTERNAL is not set, pipelines are not created.

No pipeline is created in any other cases (for example, when pushing a branch with no MR for it).

The source of truth for these workflow rules is defined in .gitlab-ci.yml.

Avoid $FORCE_GITLAB_CI

The pipeline is very complex and we need to clearly understand the kind of pipeline we want to trigger. We need to know which jobs we should run and which ones we shouldn’t.

If we use $FORCE_GITLAB_CI to force trigger a pipeline, we don’t really know what kind of pipeline it is. The result can be that we don’t run the jobs we want, or we run too many jobs we don’t care about.

Some more context and background can be found at: Avoid blanket changes to avoid unexpected run

Here’s a list of where we’re using this right now, and should try to move away from using $FORCE_GITLAB_CI.

See the next section for how we can enable pipelines without using $FORCE_GITLAB_CI.

Alternative to $FORCE_GITLAB_CI

Essentially, we use different variables to enable different pipelines. An example doing this is $START_AS_IF_FOSS. When we want to trigger a cross project FOSS pipeline, we set $START_AS_IF_FOSS, along with a set of other variables like $ENABLE_RSPEC_UNIT, $ENABLE_RSPEC_SYSTEM, and so on so forth to enable each jobs we want to run in the as-if-foss cross project downstream pipeline.

The advantage of this over $FORCE_GITLAB_CI is that we have full control over how we want to run the pipeline because $START_AS_IF_FOSS is only used for this purpose, and changing how the pipeline behaves under this variable will not affect other types of pipelines, while using $FORCE_GITLAB_CI we do not know what exactly the pipeline is because it’s used for multiple purposes.

Default image

The default image is defined in .gitlab-ci.yml.

It includes Ruby, Go, Git, Git LFS, Chrome, Node, Yarn, PostgreSQL, and Graphics Magick.

The images used in our pipelines are configured in the gitlab-org/gitlab-build-images project, which is push-mirrored to gitlab/gitlab-build-images for redundancy.

The current version of the build images can be found in the “Used by GitLab section”.

Default variables

In addition to the predefined CI/CD variables, each pipeline includes default variables defined in .gitlab-ci.yml.

Stages

The current stages are:

  • sync: This stage is used to synchronize changes from gitlab-org/gitlab to gitlab-org/gitlab-foss.
  • prepare: This stage includes jobs that prepare artifacts that are needed by jobs in subsequent stages.
  • build-images: This stage includes jobs that prepare Docker images that are needed by jobs in subsequent stages or downstream pipelines.
  • fixtures: This stage includes jobs that prepare fixtures needed by frontend tests.
  • lint: This stage includes linting and static analysis jobs.
  • test: This stage includes most of the tests, and DB/migration jobs.
  • post-test: This stage includes jobs that build reports or gather data from the test stage’s jobs (for example, coverage, Knapsack metadata, and so on).
  • review: This stage includes jobs that build the CNG images, deploy them, and run end-to-end tests against review apps (see review apps for details). It also includes Docs Review App jobs.
  • qa: This stage includes jobs that perform QA tasks against the Review App that is deployed in stage review.
  • post-qa: This stage includes jobs that build reports or gather data from the qa stage’s jobs (for example, Review App performance report).
  • pages: This stage includes a job that deploys the various reports as GitLab Pages (for example, coverage-ruby, and webpack-report (found at https://gitlab-org.gitlab.io/gitlab/webpack-report/, but there is an issue with the deployment).
  • notify: This stage includes jobs that notify various failures to Slack.

Dependency Proxy

Some of the jobs are using images from Docker Hub, where we also use ${GITLAB_DEPENDENCY_PROXY_ADDRESS} as a prefix to the image path, so that we pull images from our Dependency Proxy. By default, this variable is set from the value of ${GITLAB_DEPENDENCY_PROXY}.

${GITLAB_DEPENDENCY_PROXY} is a group CI/CD variable defined in gitlab-org as ${CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX}/. This means when we use an image defined as:

image: ${GITLAB_DEPENDENCY_PROXY_ADDRESS}alpine:edge

Projects in the gitlab-org group pull from the Dependency Proxy, while forks that reside on any other personal namespaces or groups fall back to Docker Hub unless ${GITLAB_DEPENDENCY_PROXY} is also defined there.

Work around for when a pipeline is started by a Project access token user

When a pipeline is started by a Project access token user (for example, the release-tools approver bot user which automatically updates the Gitaly version used in the main project), the Dependency proxy isn’t accessible and the job fails at the Preparing the "docker+machine" executor step. To work around that, we have a special workflow rule, that overrides the ${GITLAB_DEPENDENCY_PROXY_ADDRESS} variable so that Dependency proxy isn’t used in that case:

- if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH && $GITLAB_USER_LOGIN =~ /project_\d+_bot\d*/'
  variables:
    GITLAB_DEPENDENCY_PROXY_ADDRESS: ""
note
We don’t directly override the ${GITLAB_DEPENDENCY_PROXY} variable because group-level variables have higher precedence over .gitlab-ci.yml variables.

Common job definitions

Most of the jobs extend from a few CI definitions defined in .gitlab/ci/global.gitlab-ci.yml that are scoped to a single configuration keyword.

Job definitions Description
.default-retry Allows a job to retry upon unknown_failure, api_failure, runner_system_failure, job_execution_timeout, or stuck_or_timeout_failure.
.default-before_script Allows a job to use a default before_script definition suitable for Ruby/Rails tasks that may need a database running (for example, tests).
.repo-from-artifacts Allows a job to fetch the repository from artifacts in clone-gitlab-repo instead of cloning. This should reduce GitLab.com Gitaly load and also slightly improve the speed because downloading from artifacts is faster than cloning. Note that this should be avoided to be used with jobs having needs: [] because otherwise it’ll start later and we normally want all jobs to start as soon as possible. Use this only on jobs which has other dependencies so that we don’t wait longer than just cloning. Note that this behavior can be controlled via CI_FETCH_REPO_GIT_STRATEGY. See Fetch repository via artifacts instead of cloning/fetching from Gitaly for more details.
.setup-test-env-cache Allows a job to use a default cache definition suitable for setting up test environment for subsequent Ruby/Rails tasks.
.ruby-cache Allows a job to use a default cache definition suitable for Ruby tasks.
.static-analysis-cache Allows a job to use a default cache definition suitable for static analysis tasks.
.ruby-gems-coverage-cache Allows a job to use a default cache definition suitable for coverage tasks.
.qa-cache Allows a job to use a default cache definition suitable for QA tasks.
.yarn-cache Allows a job to use a default cache definition suitable for frontend jobs that do a yarn install.
.assets-compile-cache Allows a job to use a default cache definition suitable for frontend jobs that compile assets.
.use-pg13 Allows a job to use the postgres 13, redis, and rediscluster services (see .gitlab/ci/global.gitlab-ci.yml for the specific versions of the services).
.use-pg13-ee Same as .use-pg13 but also use an elasticsearch service (see .gitlab/ci/global.gitlab-ci.yml for the specific version of the service).
.use-pg14 Allows a job to use the postgres 14, redis, and rediscluster services (see .gitlab/ci/global.gitlab-ci.yml for the specific versions of the services).
.use-pg14-ee Same as .use-pg14 but also use an elasticsearch service (see .gitlab/ci/global.gitlab-ci.yml for the specific version of the service).
.use-pg15 Allows a job to use the postgres 15, redis, and rediscluster services (see .gitlab/ci/global.gitlab-ci.yml for the specific versions of the services).
.use-pg15-ee Same as .use-pg15 but also use an elasticsearch service (see .gitlab/ci/global.gitlab-ci.yml for the specific version of the service).
.use-pg16 Allows a job to use the postgres 16, redis, and rediscluster services (see .gitlab/ci/global.gitlab-ci.yml for the specific versions of the services).
.use-pg16-ee Same as .use-pg16 but also use an elasticsearch service (see .gitlab/ci/global.gitlab-ci.yml for the specific version of the service).
.use-kaniko Allows a job to use the kaniko tool to build Docker images.
.as-if-foss Simulate the FOSS project by setting the FOSS_ONLY='1' CI/CD variable.
.use-docker-in-docker Allows a job to use Docker in Docker. For more details, see the handbook about CI/CD configuration.

rules, if: conditions and changes: patterns

We’re using the rules keyword extensively.

All rules definitions are defined in rules.gitlab-ci.yml, then included in individual jobs via extends.

The rules definitions are composed of if: conditions and changes: patterns, which are also defined in rules.gitlab-ci.yml and included in rules definitions via YAML anchors

if: conditions

if: conditions Description Notes
if-not-canonical-namespace Matches if the project isn’t in the canonical (gitlab-org/ and gitlab-cn/) or security (gitlab-org/security) namespace. Use to create a job for forks (by using when: on_success or when: manual), or not create a job for forks (by using when: never).
if-not-ee Matches if the project isn’t EE (that is, project name isn’t gitlab or gitlab-ee). Use to create a job only in the FOSS project (by using when: on_success or when: manual), or not create a job if the project is EE (by using when: never).
if-not-foss Matches if the project isn’t FOSS (that is, project name isn’t gitlab-foss, gitlab-ce, or gitlabhq). Use to create a job only in the EE project (by using when: on_success or when: manual), or not create a job if the project is FOSS (by using when: never).
if-default-refs Matches if the pipeline is for master, main, /^[\d-]+-stable(-ee)?$/ (stable branches), /^\d+-\d+-auto-deploy-\d+$/ (auto-deploy branches), /^security\// (security branches), merge requests, and tags. Note that jobs aren’t created for branches with this default configuration.
if-master-refs Matches if the current branch is master or main.  
if-master-push Matches if the current branch is master or main and pipeline source is push.  
if-master-schedule-maintenance Matches if the current branch is master or main and pipeline runs on a 2-hourly schedule.  
if-master-schedule-nightly Matches if the current branch is master or main and pipeline runs on a nightly schedule.  
if-auto-deploy-branches Matches if the current branch is an auto-deploy one.  
if-master-or-tag Matches if the pipeline is for the master or main branch or for a tag.  
if-merge-request Matches if the pipeline is for a merge request.  
if-merge-request-title-as-if-foss Matches if the pipeline is for a merge request and the MR has label ~"pipeline:run-as-if-foss"  
if-merge-request-title-update-caches Matches if the pipeline is for a merge request and the MR has label ~"pipeline:update-cache".  
if-merge-request-labels-run-all-rspec Matches if the pipeline is for a merge request and the MR has label ~"pipeline:run-all-rspec".  
if-merge-request-labels-run-cs-evaluation Matches if the pipeline is for a merge request and the MR has label ~"pipeline:run-CS-evaluation".  
if-security-merge-request Matches if the pipeline is for a security merge request.  
if-security-schedule Matches if the pipeline is for a security scheduled pipeline.  
if-nightly-master-schedule Matches if the pipeline is for a master scheduled pipeline with $NIGHTLY set.  
if-dot-com-gitlab-org-schedule Limits jobs creation to scheduled pipelines for the gitlab-org group on GitLab.com.  
if-dot-com-gitlab-org-master Limits jobs creation to the master or main branch for the gitlab-org group on GitLab.com.  
if-dot-com-gitlab-org-merge-request Limits jobs creation to merge requests for the gitlab-org group on GitLab.com.  
if-dot-com-ee-schedule Limits jobs to scheduled pipelines for the gitlab-org/gitlab project on GitLab.com.  

changes: patterns

changes: patterns Description
ci-patterns Only create job for CI configuration-related changes.
ci-build-images-patterns Only create job for CI configuration-related changes related to the build-images stage.
ci-review-patterns Only create job for CI configuration-related changes related to the review stage.
ci-qa-patterns Only create job for CI configuration-related changes related to the qa stage.
yaml-lint-patterns Only create job for YAML-related changes.
docs-patterns Only create job for docs-related changes.
frontend-dependency-patterns Only create job when frontend dependencies are updated (for example, package.json, and yarn.lock) changes.
frontend-patterns-for-as-if-foss Only create job for frontend-related changes that have impact on FOSS.
backend-patterns Only create job for backend-related changes.
db-patterns Only create job for DB-related changes.
backstage-patterns Only create job for backstage-related changes (that is, Danger, fixtures, RuboCop, specs).
code-patterns Only create job for code-related changes.
qa-patterns Only create job for QA-related changes.
code-backstage-patterns Combination of code-patterns and backstage-patterns.
code-qa-patterns Combination of code-patterns and qa-patterns.
code-backstage-qa-patterns Combination of code-patterns, backstage-patterns, and qa-patterns.
static-analysis-patterns Only create jobs for Static Analytics configuration-related changes.

Best Practices

When to use extends:, <<: *xyz (YAML anchors), or !reference

Reference

Key takeaways

  • If you need to extend a hash, you should use extends
  • If you need to extend an array, you’ll need to use !reference, or YAML anchors as last resort
  • For more complex cases (for example, extend hash inside array, extend array inside hash, …), you’ll have to use !reference or YAML anchors

What can extends and YAML anchors do?

extends
  • Deep merge for hashes
  • NO merge for arrays. It overwrites (source)
YAML anchors
  • NO deep merge for hashes, BUT it can be used to extend a hash (see the example below)
  • NO merge for arrays, BUT it can be used to extend an array (see the example below)

A great example

This example shows how to extend complex YAML data structures with !reference and YAML anchors:

.strict-ee-only-rules:
  # `rules` is an array of hashes
  rules:
    - if: '$CI_PROJECT_NAME !~ /^gitlab(-ee)?$/ '
      when: never

# `if-security-merge-request` is a hash
.if-security-merge-request: &if-security-merge-request
  if: '$CI_PROJECT_NAMESPACE == "gitlab-org/security"'

# `code-qa-patterns` is an array
.code-qa-patterns: &code-qa-patterns
  - "{package.json,yarn.lock}"
  - ".browserslistrc"
  - "babel.config.js"
  - "jest.config.{base,integration,unit}.js"

.qa:rules:as-if-foss:
  rules:
    # We extend the `rules` array with an array of hashes directly
    - !reference [".strict-ee-only-rules", rules]
    # We extend a single array entry with a hash
    - <<: *if-security-merge-request
      # `changes` is an array, so we pass it an entire array
      changes: *code-qa-patterns

qa:selectors-as-if-foss:
  # We include the rules from .qa:rules:as-if-foss in this job
  extends:
    - .qa:rules:as-if-foss

Extend the .fast-no-clone-job job

Downloading the branch for the canonical project takes between 20 and 30 seconds.

Some jobs only need a limited number of files, which we can download via the GitLab API.

You can skip a job git clone/git fetch by adding the following pattern to a job.

Scenario 1: no before_script is defined in the job

This applies to the parent sections the job extends from as well.

You can just extend the .fast-no-clone-job:

Before:

  # Note: No `extends:` is present in the job
  a-job:
    script:
      - source scripts/rspec_helpers.sh scripts/slack
      - echo "No need for a git clone!"

After:

  # Note: No `extends:` is present in the job
  a-job:
    extends:
      - .fast-no-clone-job
    variables:
      FILES_TO_DOWNLOAD: >
        scripts/rspec_helpers.sh
        scripts/slack
    script:
      - source scripts/rspec_helpers.sh scripts/slack
      - echo "No need for a git clone!"

Scenario 2: a before_script block is already defined in the job (or in jobs it extends)

For this scenario, you have to:

  1. Extend the .fast-no-clone-job as in the first scenario (this will merge the FILES_TO_DOWNLOAD variable with the other variables)
  2. Make sure the before_script section from .fast-no-clone-job is referenced in the before_script we use for this job.

Before:

  .base-job:
    before_script:
      echo "Hello from .base-job"

  a-job:
    extends:
      - .base-job
    script:
      - source scripts/rspec_helpers.sh scripts/slack
      - echo "No need for a git clone!"

After:

  .base-job:
    before_script:
      echo "Hello from .base-job"

  a-job:
    extends:
      - .base-job
      - .fast-no-clone-job
    variables:
      FILES_TO_DOWNLOAD: >
        scripts/rspec_helpers.sh
        scripts/slack
    before_script:
      - !reference [".fast-no-clone-job", before_script]
      - !reference [".base-job", before_script]
    script:
      - source scripts/rspec_helpers.sh scripts/slack
      - echo "No need for a git clone!"

Caveats

  • This pattern does not work if a script relies on git to access the repository, because we don’t have the repository without cloning or fetching.
  • The job using this pattern needs to have curl available.
  • If you need to run bundle install in the job (even using BUNDLE_ONLY), you need to:
    • Download the gems that are stored in the gitlab-org/gitlab project.
      • You can use the download_local_gems shell command for that purpose.
    • Include the Gemfile, Gemfile.lock and Gemfile.checksum (if applicable)

Where is this pattern used?

  • For now, we use this pattern for the following jobs, and those do not block private repositories:
    • review-build-cng-env for:
      • GITALY_SERVER_VERSION
      • GITLAB_ELASTICSEARCH_INDEXER_VERSION
      • GITLAB_KAS_VERSION
      • GITLAB_PAGES_VERSION
      • GITLAB_SHELL_VERSION
      • scripts/trigger-build.rb
      • VERSION
    • review-deploy for:
      • GITALY_SERVER_VERSION
      • GITLAB_SHELL_VERSION
      • scripts/review_apps/review-apps.sh
      • scripts/review_apps/seed-dast-test-data.sh
      • VERSION
    • rspec:coverage for:
      • config/bundler_setup.rb
      • Gemfile
      • Gemfile.checksum
      • Gemfile.lock
      • scripts/merge-simplecov
      • spec/simplecov_env_core.rb
      • spec/simplecov_env.rb
    • prepare-as-if-foss-env for:
      • scripts/setup/generate-as-if-foss-env.rb

Additionally, scripts/utils.sh is always downloaded from the API when this pattern is used (this file contains the code for .fast-no-clone-job).

Runner tags

On GitLab.com, both unprivileged and privileged runners are available. For projects in the gitlab-org group and forks of those projects, only one of the following tags should be added to a job:

  • gitlab-org: Jobs randomly use privileged and unprivileged runners.
  • gitlab-org-docker: Jobs must use a privileged runner. If you need Docker-in-Docker support, use gitlab-org-docker instead of gitlab-org.

The gitlab-org-docker tag is added by the .use-docker-in-docker job definition above.

To ensure compatibility with forks, avoid using both gitlab-org and gitlab-org-docker simultaneously. No instance runners have both gitlab-org and gitlab-org-docker tags. For forks of gitlab-org projects, jobs will get stuck if both tags are supplied because no matching runners are available.

See the GitLab Repositories handbook page for more information.

Using the gitlab Ruby gem in the canonical project

When calling require 'gitlab' in the canonical project, it will require the lib/gitlab.rb file when $LOAD_PATH has lib, which happens when we’re loading the application (config/application.rb) or tests (spec/spec_helper.rb).

This means we’re not able to load the gitlab gem under the above conditions and even if we can, the constant name will conflict, breaking internal assumptions and causing random errors. If you are working on a script that is using the gitlab Ruby gem, you will need to take a few precautions:

1 - Conditional require of the gem

To avoid potential conflicts, only require the gitlab gem if the Gitlab constant isn’t defined:

# Bad
require 'gitlab'

# Good
if Object.const_defined?(:RSpec)
  # Ok, we're testing, we know we're going to stub `Gitlab`, so we just ignore
else
  require 'gitlab'

  if Gitlab.singleton_class.method_defined?(:com?)
    abort 'lib/gitlab.rb is loaded, and this means we can no longer load the client and we cannot proceed'
  end
end

2 - Mock the gitlab gem entirely in your specs

In your specs, require 'gitlab' will reference the lib/gitlab.rb file:

# Bad
allow(GitLab).to receive(:a_method).and_return(...)

# Good
client = double('GitLab')
# In order to easily stub the client, consider using a method to return the client.
# We can then stub the method to return our fake client, which we can further stub its methods.
#
# This is the pattern followed below
let(:instance) { described_class.new }

allow(instance).to receive(:gitlab).and_return(client)
allow(client).to receive(:a_method).and_return(...)

In case you need to query jobs for instance, the following snippet will be useful:

# Bad
allow(GitLab).to receive(:pipeline_jobs).and_return(...)

# Good
#
# rubocop:disable RSpec/VerifiedDoubles -- We do not load the Gitlab client directly
client = double('GitLab')
allow(instance).to receive(:gitlab).and_return(client)

jobs = ['job1', 'job2']
allow(client).to yield_jobs(:pipeline_jobs, jobs)

def yield_jobs(api_method, jobs)
  messages = receive_message_chain(api_method, :auto_paginate)

  jobs.inject(messages) do |stub, job_name|
    stub.and_yield(double(name: job_name))
  end
end
# rubocop:enable RSpec/VerifiedDoubles

3 - Do not call your script with bundle exec

Executing with bundle exec will change the $LOAD_PATH for Ruby, and it will load lib/gitlab.rb when calling require 'gitlab':

# Bad
bundle exec scripts/my-script.rb

# Good
scripts/my-script.rb

CI Configuration Testing

We now have RSpec tests to verify changes to the CI configuration by simulating pipeline creation with the updated YAML files. You can find these tests and a documentation of the current test coverage in spec/dot_gitlab_ci/job_dependency_spec.rb.

How Do the Tests Work

With the help of Ci::CreatePipelineService, we are able to simulate pipeline creation with different attributes such as branch name, MR labels, pipeline source (scheduled v.s push), pipeline type (merge train v.s merged results), etc. This is the same service utilized by the GitLab CI Lint API for validating CI/CD configurations.

These tests will automatically run for merge requests that update CI configurations. However, team members can opt to skip these tests by adding the label ~"pipeline:skip-ci-validation" to their merge requests.

Running these tests locally is encouraged, as it provides the fastest feedback.