GitLab Runner instance group autoscaler
- Tier: Free, Premium, Ultimate
- Offering: GitLab.com, GitLab Self-Managed, GitLab Dedicated
GitLab Runner instance group autoscaler is the successor to the autoscaling technology based on Docker Machine. The components of the GitLab Runner instance group autoscaling solution are:
- Taskscaler: Manages the autoscaling logic, bookkeeping, and creates fleets for runner instances that use cloud provider autoscaling groups of instances.
- Fleeting: An abstraction for cloud provider virtual machines.
- Cloud provider plugin: Handles the API calls to the target cloud platform and is implemented using a plugin development framework.
Instance group autoscaling in GitLab Runner works as follows:
- The runner manager continuously polls GitLab jobs.
- In response, GitLab sends job payloads to the runner manager.
- The runner manager interacts with the public cloud infrastructure to create a new instance to execute jobs.
- The runner manager distributes these jobs to the available runners in the autoscaling pool.
Configure the runner manager
You must configure the runner manager to use the GitLab Runner instance group autoscaler.
Create an instance to host the runner manager. This must not be a spot instance (AWS), or spot virtual machine (GCP or Azure).
Install GitLab Runner on the instance.
Add the cloud provider credentials to the runner manager host machine.
You can host the runner manager in a container. For GitLab.com and GitLab Dedicated hosted runners, the runner manager is hosted on a virtual machine instance.
Example credentials configuration for GitLab Runner instance group autoscaler
You can use an AWS Identity and Access Management (IAM) instance profile for the runner manager in the AWS environment. If you do not want to host the runner manager in AWS, you can use a credentials file.
For example:
## credentials_file
[default]
aws_access_key_id=__REDACTED__
aws_secret_access_key=__REDACTED__
The credentials file is optional.
Supported public cloud instances
The following autoscaling options are supported for public cloud compute instances:
- Amazon Web Services EC2 instances
- Google Compute Engine
- Microsoft Azure Virtual Machines
These cloud instances are supported by the GitLab Runner Docker Machine autoscaler as well.
Supported platforms
Executor | Linux | macOS | Windows |
---|---|---|---|
Instance executor | Yes | Yes | Yes |
Docker Autoscaler executor | Yes | No | Yes |
Autoscaling algorithm and parameters
The autoscaling algorithm is based on these parameters:
IdleCount
IdleCountMin
IdleScaleFactor
IdleTime
MaxGrowthRate
limit
Any machine not running a job is considered to be idle. When GitLab Runner is in autoscale mode,
it monitors all machines and ensures that there is always an IdleCount
of idle machines.
If there is an insufficient number of idle machines, GitLab Runner
starts provisioning new machines, subject to the MaxGrowthRate
limit.
Requests for machines above the MaxGrowthRate
value are put on hold
until the number of machines being created falls below MaxGrowthRate
.
At the same time, GitLab Runner is checking the duration of the idle state of
each machine. If the time exceeds the IdleTime
value, the machine is
automatically removed.
Example configuration
Consider a GitLab Runner configured with the following autoscale parameters:
[[runners]]
limit = 10
# (...)
executor = "docker+machine"
[runners.machine]
MaxGrowthRate = 1
IdleCount = 2
IdleTime = 1800
# (...)
In the beginning, when no jobs are queued, GitLab Runner starts two machines
(IdleCount = 2
), and sets them in idle state. Also, the IdleTime
is set
to 30 minutes (IdleTime = 1800
).
Now, assume that five jobs are queued in GitLab CI/CD. The first two jobs are
sent to the idle machines of which we have two. GitLab Runner starts new machines as it now notices that
the number of idle is less than IdleCount
(0 < 2
). These machines are provisioned sequentially,
to prevent exceeding the MaxGrowthRate
.
The remaining three jobs are assigned to the first machine that is ready. As an optimization, this can be a machine that was busy, but has now completed its job, or it can be a newly provisioned machine. For this example, assume that provisioning is fast and the new machines are ready before any earlier jobs complete.
We now have one idle machine, so GitLab Runner starts one new machine to
satisfy IdleCount
. Because there are no new jobs in queue, those two
machines stay in idle state and GitLab Runner is satisfied.
What happened:
In the example, there are two machines waiting in idle state for new jobs. After the five jobs are queued, new machines are created. So, in total there are seven machines: five running jobs and two in idle state waiting for the next jobs.
The autoscaling algorithm works the same way. GitLab Runner creates a new
idle machine for each machine used for the job execution, until IdleCount
is satisfied. Machines are created up to the number defined by the
limit
parameter. When GitLab Runner detects that this limit
has been reached,
it stops autoscaling. The new jobs must wait in the job queue until machines
start returning to idle state.
In the above example, two idle machines are always available. The IdleTime
parameter
applies only when the number exceeds IdleCount
. At this point, GitLab Runner reduces
the number of machines to match IdleCount
.
Scaling down:
After the job finishes, the machine is set to idle state and waits
for new jobs to be executed. If no new jobs appear in the queue,
idle machines are removed after the time specified by IdleTime
.
In this example, all machines are removed after 30 minutes of inactivity
(measured from when each machine’s last job execution ended). GitLab
Runner maintains an IdleCount
of idle machines running, just like
at the beginning of the example.
The autoscaling algorithm works as follows:
- GitLab Runner starts.
- GitLab Runner creates two idle machines.
- GitLab Runner picks one job.
- GitLab Runner creates one more machine to maintain two idle machines.
- The picked job finishes, resulting in three idle machines.
- When one of the three idle machines exceeds
IdleTime
from the time after it picked the last job, it is removed. - GitLab Runner always maintains at least two idle machines for quick job processing.
The following chart illustrates the states of machines and builds (jobs) in time:
Docs
Edit this page to fix an error or add an improvement in a merge request.
Create an issue to suggest an improvement to this page.
Product
Create an issue if there's something you don't like about this feature.
Propose functionality by submitting a feature request.
Feature availability and product trials
View pricing to see all GitLab tiers and features, or to upgrade.
Try GitLab for free with access to all features for 30 days.
Get help
If you didn't find what you were looking for, search the docs.
If you want help with something specific and could use community support, post on the GitLab forum.
For problems setting up or using this feature (depending on your GitLab subscription).
Request support