Zero downtime upgrades

It’s possible to upgrade to a newer major, minor, or patch version of GitLab without having to take your GitLab instance offline. However, for this to work there are the following requirements:

  • You can only upgrade one minor release at a time. So from 13.1 to 13.2, not to 13.3. If you skip releases, database modifications may be run in the wrong sequence and leave the database schema in a broken state.
  • You have to use post-deployment migrations.
  • You are using PostgreSQL. Starting from GitLab 12.1, MySQL is not supported.
  • You have set up a multi-node GitLab instance. Single-node instances do not support zero-downtime upgrades.
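The one-minor-release rule can be checked mechanically before you start. The following is a minimal sketch, not part of GitLab: the function name is_single_minor_step is hypothetical, and it only compares versions within the same major version. It cannot tell whether a jump across major versions (for example, 13.12 to 14.0) is a single step, so it reports that case separately.

```shell
# Sketch (hypothetical helper): check that TARGET is in the same minor release
# as CURRENT, or exactly one minor release later, within the same major version.
# Exit status: 0 = allowed step
#              1 = skips a minor release (or is a downgrade)
#              2 = different major version (not decidable from the numbers alone)
is_single_minor_step() {
  cur_major=${1%%.*}          # "13.4.2" -> "13"
  cur_rest=${1#*.}            # "13.4.2" -> "4.2"
  cur_minor=${cur_rest%%.*}   # "4.2"    -> "4"
  tgt_major=${2%%.*}
  tgt_rest=${2#*.}
  tgt_minor=${tgt_rest%%.*}

  if [ "$cur_major" -ne "$tgt_major" ]; then
    return 2
  fi
  step=$(( $tgt_minor - $cur_minor ))
  [ "$step" -eq 0 ] || [ "$step" -eq 1 ]
}
```

For example, is_single_minor_step 13.1 13.2 succeeds, while is_single_minor_step 13.1 13.3 fails.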

If you want to upgrade multiple releases or do not meet the other requirements, upgrade with downtime instead.

If you meet all the requirements above, follow these instructions in order. The steps depend on your deployment type:

Deployment type              Description
Gitaly or Gitaly Cluster     GitLab CE/EE using HA architecture for Gitaly or Gitaly Cluster
Multi-node / PostgreSQL HA   GitLab CE/EE using HA architecture for PostgreSQL
Multi-node / Redis HA        GitLab CE/EE using HA architecture for Redis
Geo                          GitLab EE with Geo enabled
Multi-node / HA with Geo     GitLab CE/EE on multiple nodes

Each type of deployment requires that you hot reload the puma processes and restart the sidekiq processes on all nodes running these services after you’ve upgraded. Each of these processes loads the GitLab Rails application, which reads the database schema into memory at startup. They must therefore be reloaded (Puma) or restarted (Sidekiq) to re-read any database changes made by post-deployment migrations.

Most of the time you can safely upgrade from a patch release to the next minor release if the patch release is not the latest. For example, upgrading from 14.1.1 to 14.2.0 should be safe even if 14.1.2 has been released. We do recommend you check the release posts of any releases between your current and target version just in case they include any migrations that may require you to upgrade one release at a time.

We also recommend you verify the version-specific upgrading instructions relevant to your upgrade path.

Some releases may also include so-called “background migrations”. These migrations are performed in the background by Sidekiq and are often used for migrating data. Background migrations are only added in the monthly releases.

Certain major/minor releases may require a set of background migrations to be finished. To guarantee this, such a release processes any remaining jobs before continuing the upgrade procedure. While this doesn’t require downtime (if the above conditions are met), you must wait for background migrations to complete between each major/minor release upgrade. You can reduce the time needed to complete these migrations by increasing the number of Sidekiq workers that can process jobs in the background_migration queue. To see the size of this queue, check for background migrations before upgrading.
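Waiting for background migrations between upgrades can be scripted as a simple polling loop. The following is a minimal sketch under stated assumptions: both function names are hypothetical, and the default check uses the Rails runner one-liner that prints the number of remaining background migration jobs. The exact check can differ between GitLab versions, so confirm it against the version-specific instructions before relying on it.

```shell
# Sketch (hypothetical helper): print the number of remaining background
# migration jobs. Redefine this function if your version needs another check.
background_migrations_remaining() {
  sudo gitlab-rails runner -e production 'puts Gitlab::BackgroundMigration.remaining'
}

# Poll until no jobs remain.
# $1 = seconds between checks (default 60), $2 = maximum number of checks (default 120)
# Exit status: 0 = queue is empty, 1 = gave up, 2 = the check itself failed.
wait_for_background_migrations() {
  interval=${1:-60}
  max_checks=${2:-120}
  checks=0
  while [ "$checks" -lt "$max_checks" ]; do
    remaining=$(background_migrations_remaining) || return 2
    if [ "$remaining" -eq 0 ]; then
      return 0
    fi
    checks=$(( $checks + 1 ))
    sleep "$interval"
  done
  return 1
}
```

Run wait_for_background_migrations on a Rails node after each major/minor upgrade, and start the next upgrade only when it returns successfully.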

As a guideline, any database smaller than 10 GB doesn’t take too much time to upgrade; perhaps an hour at most per minor release. Larger databases however may require more time, but this is highly dependent on the size of the database and the migrations that are being performed.

To help explain this, let’s look at some examples:

Example 1: You are running a large GitLab installation using version 13.4.2, which is the latest patch release of 13.4. When GitLab 13.5.0 is released, this installation can be safely upgraded to 13.5.0 without requiring downtime if the requirements mentioned above are met. You can also skip 13.5.0 and upgrade to 13.5.1 after it’s released, but you cannot upgrade straight to 13.6.0; you have to first upgrade to a 13.5.Z release.

Example 2: You are running a large GitLab installation using version 13.4.2, which is the latest patch release of 13.4. GitLab 13.5 includes some background migrations, and 14.0 requires these to be completed (processing any remaining jobs for you). Skipping 13.5 is not possible without downtime, and due to the background migrations would require potentially hours of downtime depending on how long it takes for the background migrations to complete. To work around this you have to upgrade to 13.5.Z first, then wait at least a week before upgrading to 14.0.

Example 3: You use MySQL as the database for GitLab. Any upgrade to a new major/minor release requires downtime. If a release includes any background migrations this could potentially lead to hours of downtime, depending on the size of your database. To work around this you must use PostgreSQL and meet the other online upgrade requirements mentioned above.

Multi-node / HA deployment

Caution: You can only upgrade one minor release at a time, for example from 13.6 to 13.7, not to 13.8. If you attempt to upgrade more than one minor release at once, the upgrade may fail.

Use a load balancer in front of web (Puma) nodes

With Puma, single-node zero-downtime updates are no longer possible. To achieve HA with zero-downtime updates, you need at least two nodes behind a load balancer that distributes connections across them.

The load balancer in front of the application nodes must be configured to use the appropriate health check endpoints to determine whether a service is accepting traffic. For Puma, use the /-/readiness endpoint. For Sidekiq and other services, you can use the /readiness endpoint.

Upgrades on web (Puma) nodes must be done in a rolling manner, one after another, ensuring at least one node is always up to serve traffic. This is required to ensure zero-downtime.

Puma enters a blackout period as part of the upgrade, during which nodes continue to accept connections but mark their respective health check endpoints to be unhealthy. On seeing this, the load balancer should disconnect them gracefully.

Puma restarts only after completing all the requests it is currently processing. This ensures data and service integrity. Once the nodes have restarted, their health check endpoints are marked healthy again.
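During a rolling upgrade, it can help to wait until a node reports healthy again before you move on to the next one. The following is a minimal sketch, assuming the /-/readiness endpoint answers with a non-success HTTP status while the node is unhealthy, and that the machine running the check is allowed to query the endpoint. The function names and the example URL are hypothetical.

```shell
# Sketch (hypothetical helper): succeed when the node's Puma readiness
# endpoint answers with an HTTP success status.
# $1 = base URL of the node, for example http://app1.example.com
node_is_ready() {
  curl --fail --silent --show-error --max-time 5 --output /dev/null "$1/-/readiness"
}

# Retry the readiness check until it succeeds.
# $1 = base URL, $2 = maximum attempts (default 30), $3 = seconds between attempts (default 10)
# Exit status: 0 = node is ready, 1 = node did not become ready in time.
wait_until_ready() {
  base_url=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if node_is_ready "$base_url"; then
      return 0
    fi
    i=$(( $i + 1 ))
    sleep "$delay"
  done
  return 1
}
```

For example, run wait_until_ready http://app1.example.com after reloading Puma on that node, and proceed to the next node only if it succeeds.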

To update an HA instance that uses a load balancer to the latest GitLab version, update the nodes in the following order:

  1. Select one application node as a deploy node and complete the following steps on it:

    1. Create an empty file at /etc/gitlab/skip-auto-reconfigure. This prevents upgrades from running gitlab-ctl reconfigure, which by default automatically stops GitLab, runs all database migrations, and restarts GitLab:

       sudo touch /etc/gitlab/skip-auto-reconfigure
      
    2. Update the GitLab package:

      # Debian/Ubuntu
      sudo apt-get update && sudo apt-get install gitlab-ee
      
      # CentOS/RHEL
      sudo yum install gitlab-ee
      

      If you are a Community Edition user, replace gitlab-ee with gitlab-ce in the above command.

    3. Get the regular migrations and latest code in place. Before running this step, the deploy node’s /etc/gitlab/gitlab.rb configuration file must have gitlab_rails['auto_migrate'] = true to permit regular migrations.

      sudo SKIP_POST_DEPLOYMENT_MIGRATIONS=true gitlab-ctl reconfigure
      
    4. Ensure services use the latest code:

      sudo gitlab-ctl hup puma
      sudo gitlab-ctl restart sidekiq
      
  2. Complete the following steps on the other Puma/Sidekiq nodes, one after another. Always ensure that at least one such node is up, running, and connected to the load balancer before proceeding to the next node.

    1. Update the GitLab package and ensure a reconfigure is run as part of it. If it is not run (because the /etc/gitlab/skip-auto-reconfigure file is present), run sudo gitlab-ctl reconfigure manually.

    2. Ensure services use the latest code:

      sudo gitlab-ctl hup puma
      sudo gitlab-ctl restart sidekiq
      
  3. On the deploy node, run the post-deployment migrations:

       sudo gitlab-rake db:migrate
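The ordering of the steps above matters more than the individual commands, so it can be useful to print the plan for your own nodes before starting. The following is a minimal sketch that only prints the sequence and does not run anything. The function name and the node names in the usage example are hypothetical.

```shell
# Sketch (hypothetical helper): print the rolling upgrade sequence described
# above. $1 = deploy node, remaining arguments = other Puma/Sidekiq nodes.
# Nothing is executed; each line names the node and the step to perform there.
print_rolling_upgrade_plan() {
  deploy_node=$1
  shift
  echo "[$deploy_node] sudo touch /etc/gitlab/skip-auto-reconfigure"
  echo "[$deploy_node] upgrade the GitLab package"
  echo "[$deploy_node] sudo SKIP_POST_DEPLOYMENT_MIGRATIONS=true gitlab-ctl reconfigure"
  echo "[$deploy_node] sudo gitlab-ctl hup puma && sudo gitlab-ctl restart sidekiq"
  for node in "$@"; do
    echo "[$node] upgrade the GitLab package, then sudo gitlab-ctl reconfigure"
    echo "[$node] sudo gitlab-ctl hup puma && sudo gitlab-ctl restart sidekiq"
  done
  # Post-deployment migrations run last, after every node has the new code.
  echo "[$deploy_node] sudo gitlab-rake db:migrate"
}
```

For example, print_rolling_upgrade_plan app1 app2 app3 lists the deploy node steps for app1, then the steps for app2 and app3 in turn, and finally the post-deployment migrations on app1.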
    

Gitaly or Gitaly Cluster

Gitaly nodes can be located on their own server, either as part of a sharded setup, or as part of Gitaly Cluster.

Before you update the main GitLab application you must (in order):

  1. Upgrade the Gitaly nodes that reside on separate servers.
  2. Upgrade Praefect if using Gitaly Cluster.

Upgrade Gitaly nodes

Upgrade the Gitaly nodes one at a time to ensure access to Git repositories is maintained:

# Debian/Ubuntu
sudo apt-get update && sudo apt-get install gitlab-ee

# CentOS/RHEL
sudo yum install gitlab-ee

If you are a Community Edition user, replace gitlab-ee with gitlab-ce in the above command.

Upgrade Praefect

From the Praefect nodes, select one to be your Praefect deploy node. You install the new Omnibus package on the deploy node first and run database migrations.

  1. On the Praefect deploy node:

    1. Create an empty file at /etc/gitlab/skip-auto-reconfigure. This prevents upgrades from running gitlab-ctl reconfigure, which by default automatically stops GitLab, runs all database migrations, and restarts GitLab:

      sudo touch /etc/gitlab/skip-auto-reconfigure
      
    2. Ensure that praefect['auto_migrate'] = true is set in /etc/gitlab/gitlab.rb.

  2. On all remaining Praefect nodes, ensure that praefect['auto_migrate'] = false is set in /etc/gitlab/gitlab.rb to prevent reconfigure from automatically running database migrations.

  3. On the Praefect deploy node:

    1. Upgrade the GitLab package:

      # Debian/Ubuntu
      sudo apt-get update && sudo apt-get install gitlab-ee
      
      # CentOS/RHEL
      sudo yum install gitlab-ee
      

      If you are a Community Edition user, replace gitlab-ee with gitlab-ce in the command above.

    2. To apply the Praefect database migrations and restart Praefect, run:

      sudo gitlab-ctl reconfigure
      
  4. On all remaining Praefect nodes:

    1. Upgrade the GitLab package:

      sudo apt-get update && sudo apt-get install gitlab-ee
      

      If you are a Community Edition user, replace gitlab-ee with gitlab-ce in the command above.

    2. Ensure nodes are running the latest code:

      sudo gitlab-ctl reconfigure
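Because the Praefect procedure depends on praefect['auto_migrate'] being true on the deploy node and false everywhere else, it may be worth verifying the setting on each node before upgrading. The following is a minimal sketch, assuming the setting appears as a simple "key = value" line in /etc/gitlab/gitlab.rb. The function name is hypothetical, and the sketch does not evaluate Ruby, so it misses values set in other ways.

```shell
# Sketch (hypothetical helper): print the value assigned to KEY by the last
# uncommented line of FILE that mentions it.
# $1 = path to gitlab.rb, $2 = setting name, for example "praefect['auto_migrate']"
gitlab_rb_value() {
  file=$1
  key=$2
  grep -F -- "$key" "$file" | grep -v '^[[:space:]]*#' | tail -n 1 |
    sed 's/^[^=]*=[[:space:]]*//; s/[[:space:]]*$//'
}
```

For example, gitlab_rb_value /etc/gitlab/gitlab.rb "praefect['auto_migrate']" should print true on the Praefect deploy node and false on the remaining Praefect nodes.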
      

PostgreSQL

Pick a node to be the Deploy Node. It can be any application node, but it must be the same node throughout the process.

Deploy node

  • Create an empty file at /etc/gitlab/skip-auto-reconfigure. This prevents upgrades from running gitlab-ctl reconfigure, which by default automatically stops GitLab, runs all database migrations, and restarts GitLab:

    sudo touch /etc/gitlab/skip-auto-reconfigure
    

All nodes including the Deploy node

  • To prevent reconfigure from automatically running database migrations, ensure that gitlab_rails['auto_migrate'] = false is set in /etc/gitlab/gitlab.rb.

PostgreSQL-only nodes

  • Update the GitLab package:

    # Debian/Ubuntu
    sudo apt-get update && sudo apt-get install gitlab-ee
    
    # CentOS/RHEL
    sudo yum install gitlab-ee
    

    If you are a Community Edition user, replace gitlab-ee with gitlab-ce in the above command.

  • Ensure nodes are running the latest code:

    sudo gitlab-ctl reconfigure
    

Deploy node

  • Update the GitLab package:

    # Debian/Ubuntu
    sudo apt-get update && sudo apt-get install gitlab-ee
    
    # CentOS/RHEL
    sudo yum install gitlab-ee
    

    If you are a Community Edition user, replace gitlab-ee with gitlab-ce in the above command.

  • If you’re using PgBouncer:

    You need to bypass PgBouncer and connect directly to the database leader before running migrations.

    Rails uses an advisory lock when attempting to run a migration to prevent concurrent migrations from running on the same database. These locks are not shared across transactions, resulting in ActiveRecord::ConcurrentMigrationError and other issues when running database migrations using PgBouncer in transaction pooling mode.

    To find the leader node, run the following on a database node:

    sudo gitlab-ctl patroni members
    

    Then, in your gitlab.rb file on the deploy node, update gitlab_rails['db_host'] and gitlab_rails['db_port'] with the database leader’s host and port.

  • To get the regular database migrations and latest code in place, run:

    sudo gitlab-ctl reconfigure
    sudo SKIP_POST_DEPLOYMENT_MIGRATIONS=true gitlab-rake db:migrate
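For the PgBouncer step above, the leader's host can be extracted from the member list rather than read by eye. The following is a minimal sketch, assuming sudo gitlab-ctl patroni members prints a pipe-delimited table whose columns are Member, Host, Role, and so on, with the leader marked as Leader in the Role column. The function name is hypothetical, and the table layout can vary between versions, so check the output on your own nodes first.

```shell
# Sketch (hypothetical helper): read a Patroni member table on standard input
# and print the Host column of every row whose Role column is "Leader".
# Assumed row layout: | Member | Host | Role | State | TL | Lag in MB |
patroni_leader_host() {
  awk -F'|' '
    {
      role = $4
      gsub(/^[ \t]+|[ \t]+$/, "", role)   # trim surrounding whitespace
      if (role == "Leader") {
        host = $3
        gsub(/^[ \t]+|[ \t]+$/, "", host)
        print host
      }
    }
  '
}
```

For example, sudo gitlab-ctl patroni members | patroni_leader_host prints the value to use for gitlab_rails['db_host'] on the deploy node.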
    

All nodes excluding the D