Cleanup policies

Cleanup policies are recurrent background processes that automatically remove objects according to some parameters set by users.

Container registry

Cleanup policies for the container registry work on all the container repositories hosted in a single project. All tags that match the cleanup parameters are removed.

Parameters

The ContainerExpirationPolicy holds all parameters for the container registry cleanup policies.

The parameters are split into two groups:

  • The parameters that define tags to keep:
    • keep_n. Keep the n most recent tags.
    • name_regex_keep. Keep tags matching this regular expression.
  • The parameters that define tags to destroy:
    • older_than. Destroy tags older than this timestamp.
    • name_regex. Destroy tags matching this regular expression.

The remaining parameters impact when the policy is executed:

  • enabled. Defines if the policy is enabled or not.
  • cadence. Defines the execution cadence of the policy.
  • next_run_at. Defines when the next execution should happen.

Execution

Due to the large number of policies we need to process on GitLab.com, the execution follows this design.

  • Policy executions are limited in time.
  • Policy executions are either complete or partial.
  • The background jobs will consider the next job to be executed based on two priorities:
    • Policy with a next_run_at in the past.
    • Partially executed policies.

To track the cleanup policy status on a container repository, we have an expiration_policy_cleanup_status on the ContainerRepository model.

Background jobs for this execution are organized on:

  • A cron background job that runs every hour.
  • A set of background jobs that will loop on container repositories that need a policy execution.

The cron background job

The cron background job is quite simple. Its main tasks are:

  1. Check if there are any container repositories in need of a cleanup. If any, enqueue as many limited capacity jobs as necessary, up to a limit.
  2. Compute metrics for cleanup policies and log them.

The limited capacity job

This job is based on the limited capacity concern.

This job will run in parallel up to a specific capacity.

The primary responsibility of this job is to select the next container repository that requires cleaning and call the related service on it.

This is where the two priorities are evaluated in order. If a container repository is found, the cleanup service is called on it.

To ensure that only one cleaning is executed on a given container repository at any time, we use a database lock along with the expiration_policy_cleanup_status column.

This job will re-enqueue itself until no more container repositories require cleanup.

Services

Here is the services call that will happen from the limited capacity job:

flowchart TD job[Limited capacity job] --> cleanup([ContainerExpirationPolicies::CleanupService]) cleanup --> cleanup_tags([Projects::ContainerRepository::CleanupTagsService]) cleanup_tags --> delete_tags([Projects::ContainerRepository::DeleteTagsService])

The cleanup tags service uses a very specific execution order to build the list of tags to destroy.

Lastly, the cleanup tags service and delete tags service work using facades. The actual implementation depends on the type of container registry connected. If the GitLab container registry is connected, several improvements are available and used during cleanup policies execution, such as better use of the container registry API.