GitLab Geo

GitLab Geo provides the ability to have geographically distributed application deployments.

While external database services can be used, this guide focuses on using Omnibus GitLab for PostgreSQL. This keeps the instructions platform-agnostic and makes use of the automation included in gitlab-ctl.

In this guide, both clusters have the same external URL. See Set up a Unified URL for Geo sites.

note
See the defined terms to describe all aspects of Geo (mainly the distinction between site and node).

Requirements

To use GitLab Geo with the GitLab Helm chart, the following requirements must be met:

  • External PostgreSQL services must be used, because the PostgreSQL included with the chart is not exposed to outside networks and doesn’t have the WAL support required for replication.
  • The supplied database must:
    • Support replication.
    • Support SSL between primary and secondary database nodes.
  • The primary database must be reachable by the primary site and by all secondary database nodes (for replication).
  • Secondary databases only need to be reachable by the secondary sites.
  • The primary site must be reachable via HTTP(S) by all secondary sites. Secondary sites must be accessible to the primary site via HTTP(S).

Overview

This guide uses 2 Omnibus GitLab database nodes, configuring only the PostgreSQL services needed, and 2 deployments of the GitLab Helm chart. It is intended to be the minimal required configuration. This documentation does not include SSL from application to database, support for other database providers, or promoting a secondary site to primary.

The outline below should be followed in order:

  1. Set up Omnibus database nodes
  2. Set up Kubernetes clusters
  3. Collect information
  4. Configure Primary database
  5. Deploy chart as Geo Primary site
  6. Set the Geo Primary site
  7. Configure Secondary database
  8. Copy secrets from primary site to secondary site
  9. Deploy chart as Geo Secondary site
  10. Add Secondary Geo site via Primary
  11. Confirm Operational Status

Set up Omnibus database nodes

For this process, two nodes are required. One is the Primary database node, the other the Secondary database node. You may use any provider of machine infrastructure, on-premise or from a cloud provider.

Bear in mind that communication is required:

  • Between the two database nodes for replication.
  • Between each database node and its respective Kubernetes deployment:
    • The primary needs to expose TCP port 5432.
    • The secondary needs to expose TCP ports 5432 & 5431.
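The port requirements above can be checked from any machine that needs access before continuing. The following is a minimal sketch and not part of Omnibus GitLab: check_port is a hypothetical helper name, and it assumes that bash and the coreutils timeout command are available.

```shell
# check_port HOST PORT
# Returns 0 if a TCP connection to HOST:PORT opens within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
check_port() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
```

For example, `check_port geo-1.db.example.com 5432 && echo reachable` prints reachable when the Primary database port accepts connections. The host name is the one used in the chart example in this guide; substitute your own addresses.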

Install an operating system supported by Omnibus GitLab, and then install Omnibus GitLab on it. Do not provide the EXTERNAL_URL environment variable when installing, as we’ll provide a minimal configuration file before reconfiguring the package.

After the operating system and the GitLab package are installed, the configuration for the services that will be used can be created. Before we do that, information must be collected.

Set up Kubernetes clusters

For this process, two Kubernetes clusters should be used. These can be from any provider, on-premise or from a cloud provider.

Bear in mind that communication is required:

  • To the respective database nodes:
    • Primary outbound to TCP 5432.
    • Secondary outbound to TCP 5432 and 5431.
  • Between both Kubernetes Ingress via HTTPS.

Each cluster that is provisioned should have:

Collect information

To continue with the configuration, the following information needs to be collected from the various sources. Collect these, and make notes for use through the rest of this documentation.

  • Primary database:
    • IP address
    • hostname (optional)
  • Secondary database:
    • IP address
    • hostname (optional)
  • Primary cluster:
    • External URL
    • Internal URL
    • IP addresses of nodes
  • Secondary cluster:
    • Internal URL
    • IP addresses of nodes
  • Database passwords (decide on these in advance):
    • gitlab (used in postgresql['sql_user_password'], global.psql.password)
    • gitlab_geo (used in geo_postgresql['sql_user_password'], global.geo.psql.password)
    • gitlab_replicator (needed for replication)
  • Your GitLab license file
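
The node IP addresses listed above can be read from each cluster with kubectl. This is a minimal sketch, assuming kubectl is already pointed at the cluster in question and that the nodes report an InternalIP address. Some providers also report an ExternalIP, which may be the address your database node actually sees. list_node_ips is a hypothetical helper name.

```shell
# Print the InternalIP of every node in the current kubectl context, one per line.
list_node_ips() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
}
```

Run it once per cluster and keep the output with your notes.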

The Internal URL of each cluster must be unique to the cluster, so that all clusters can make requests to all other clusters. For example:

  • External URL of all clusters: https://gitlab.example.com
  • Primary cluster’s Internal URL: https://london.gitlab.example.com
  • Secondary cluster’s Internal URL: https://shanghai.gitlab.example.com

This guide does not cover setting up DNS.

The gitlab and gitlab_geo database user passwords must exist in two forms: the bare password and the PostgreSQL hashed password. To obtain the hashed form, run the following commands on one of the Omnibus instances. Each command asks you to enter and confirm the password, then outputs a hash value for you to note.

  1. gitlab-ctl pg-password-md5 gitlab
  2. gitlab-ctl pg-password-md5 gitlab_geo
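
For reference, PostgreSQL’s MD5 scheme hashes the password concatenated with the user name, and PostgreSQL itself stores that digest prefixed with the literal string md5. The sketch below reproduces the digest calculation with md5sum, assuming gitlab-ctl pg-password-md5 follows the same scheme. pg_md5 is a hypothetical helper, and the value printed by gitlab-ctl remains the one to use in the configuration.

```shell
# pg_md5 PASSWORD USERNAME
# Prints the hex MD5 digest of PASSWORD immediately followed by USERNAME.
# Note: passing a password as an argument may leave it in your shell history.
pg_md5() {
  printf '%s%s' "$1" "$2" | md5sum | cut -d ' ' -f 1
}
```

Comparing its output with the gitlab-ctl output for the same password and user is a quick way to confirm that the hash you noted matches the password you intended.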

Configure Primary database

This section is performed on the Primary Omnibus GitLab database node.

To configure the Primary database node’s Omnibus GitLab, work from this example configuration:

### Geo Primary
external_url 'http://gitlab.example.com'
roles ['geo_primary_role']
# The unique identifier for the Geo node.
gitlab_rails['geo_node_name'] = 'London Office'
gitlab_rails['auto_migrate'] = false
## turn off everything but the DB
sidekiq['enable']=false
puma['enable']=false
gitlab_workhorse['enable']=false
nginx['enable']=false
geo_logcursor['enable']=false
grafana['enable']=false
gitaly['enable']=false
redis['enable']=false
prometheus_monitoring['enable'] = false
## Configure the DB for network
postgresql['enable'] = true
postgresql['listen_address'] = '0.0.0.0'
postgresql['sql_user_password'] = 'gitlab_user_password_hash'
# !! CAUTION !!
# This list of CIDR addresses should be customized
# - primary application deployment
# - secondary database node(s)
postgresql['md5_auth_cidr_addresses'] = ['0.0.0.0/0']

We must replace several items:

  • external_url must be updated to reflect the host name of our Primary site.
  • gitlab_rails['geo_node_name'] must be replaced with a unique name for your site. See the Name field in Common settings.
  • gitlab_user_password_hash must be replaced with the hashed form of the gitlab password.
  • postgresql['md5_auth_cidr_addresses'] can be updated to a list of explicit IP addresses, or address blocks in CIDR notation.

The md5_auth_cidr_addresses should be in the form of [ '127.0.0.1/24', '10.41.0.0/16']. It is important to include 127.0.0.1 in this list, as the automation in Omnibus GitLab connects using this address. The list should also include the IP address (not hostname) of your Secondary database, and all nodes of your primary Kubernetes cluster. The value can be left as ['0.0.0.0/0'], but this allows connections from any address and is not best practice.

After the configuration above is prepared:

  1. Place the content into /etc/gitlab/gitlab.rb
  2. Run gitlab-ctl reconfigure. If the service is not listening on TCP afterwards, try restarting it directly with gitlab-ctl restart postgresql.
  3. Run gitlab-ctl set-replication-password to set the password for the gitlab_replicator user.
  4. Retrieve the Primary database node’s public certificate, which the Secondary database needs in order to replicate (save this output):

    cat ~gitlab-psql/data/server.crt
    

Deploy chart as Geo Primary site

This section is performed on the Primary site’s Kubernetes cluster.

To deploy this chart as a Geo Primary, start from this example configuration:

  1. Create a secret containing the database password for the chart to consume. Replace PASSWORD below with the password for the gitlab database user:

    kubectl --namespace gitlab create secret generic geo --from-literal=postgresql-password=PASSWORD
    
  2. Create a primary.yaml file based on the example configuration and update the configuration to reflect the correct values:

    ### Geo Primary
    global:
      # See docs.gitlab.com/charts/charts/globals
      # Configure host & domain
      hosts:
        domain: example.com
      # configure DB connection
      psql:
        host: geo-1.db.example.com
        port: 5432
        password:
          secret: geo
          key: postgresql-password
      # configure geo (primary)
      geo:
        nodeName: London Office
        enabled: true
        role: primary
    # External DB, disable
    postgresql:
      install: false
    
  3. Deploy the chart using this configuration:

    helm upgrade --install gitlab-geo gitlab/gitlab --namespace gitlab -f primary.yaml
    
    note
    This assumes you are using the gitlab namespace. If you want to use a different namespace, you should also replace it in --namespace gitlab throughout the rest of this document.
  4. Wait for the deployment to complete and the application to come online.

  5. Sign in to GitLab, and activate your GitLab subscription.

    note
    This step is required for Geo to function.
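
Steps 1 and 4 above can be verified from the command line. This is a minimal sketch, assuming the gitlab namespace and the geo secret name used in this section. The 600 second timeout is an arbitrary choice, and both function names are hypothetical.

```shell
# Print the stored database password so it can be compared with the intended one.
# The password is shown in clear text, so avoid running this on a shared screen.
show_db_password() {
  kubectl --namespace gitlab get secret geo \
    -o jsonpath='{.data.postgresql-password}' | base64 --decode
}

# Block until every Deployment in the namespace reports Available, or time out.
wait_for_gitlab() {
  kubectl --namespace gitlab wait --for=condition=Available \
    deployment --all --timeout=600s
}
```

A non-zero exit status from wait_for_gitlab means at least one Deployment did not become available in time, in which case `kubectl --namespace gitlab get pods` shows which Pods are still starting.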

Set the Geo Primary site

Now that the chart has been deployed, and a license uploaded, we can configure this as the Primary site. We will do this via the Toolbox Pod.

  1. Find the Toolbox Pod

    kubectl --namespace gitlab get pods -lapp=toolbox
    
  2. Run gitlab-rake geo:set_primary_node with kubectl exec:

    kubectl --namespace gitlab exec -ti gitlab-geo-toolbox-XXX -- gitlab-rake geo:set_primary_node
    
  3. Set the primary site’s Internal URL with a Rails runner command. Replace https://primary.gitlab.example.com with the actual Internal URL:

    kubectl --namespace gitlab exec -ti gitlab-geo-toolbox-XXX -- gitlab-rails runner "GeoNode.primary_node.update!(internal_url: 'https://primary.gitlab.example.com')"
    
  4. Check the status of Geo configuration:

    kubectl --namespace gitlab exec -ti gitlab-geo-toolbox-XXX -- gitlab-rake gitlab:geo:check
    

    You should see output similar to the following:

    WARNING: This version of GitLab depends on gitlab-shell 10.2.0, but you're running Unknown. Please update gitlab-shell.
    Checking Geo ...
    
    GitLab Geo is available ... yes
    GitLab Geo is enabled ... yes
    GitLab Geo secondary database is correctly configured ... not a secondary node
    Database replication enabled? ... not a secondary node
    Database replication working? ... not a secondary node
    GitLab Geo HTTP(S) connectivity ... not a secondary node
    HTTP/HTTPS repository cloning is enabled ... yes
    Machine clock is synchronized ... Exception: getaddrinfo: Servname not supported for ai_socktype
    Git user has default SSH configuration? ... yes
    OpenSSH configured to use AuthorizedKeysCommand ... no
      Reason:
      Cannot find OpenSSH configuration file at: /assets/sshd_config
      Try fixing it:
      If you are not using our official docker containers,
      make sure you have OpenSSH server installed and configured correctly on this system
      For more information see:
      doc/administration/operations/fast_ssh_key_lookup.md
    GitLab configured to disable writing to authorized_keys file ... yes
    GitLab configured to store new projects in hashed storage? ... yes
    All projects are in hashed storage? ... yes
    
    Checking Geo ... Finished
    
    • Don’t worry about Exception: getaddrinfo: Servname not supported for ai_socktype, as Kubernetes containers don’t have access to the host clock. This is OK.
    • OpenSSH configured to use AuthorizedKeysCommand ... no is expected. This Rake task is checking for a local SSH server, which is actually present in the gitlab-shell chart, deployed elsewhere, and already configured appropriately.
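
The commands in this section use gitlab-geo-toolbox-XXX as a placeholder for the Toolbox Pod name. This is a minimal sketch for looking the name up instead of copying it by hand, assuming the gitlab namespace, the app=toolbox label from step 1, and at least one running Toolbox Pod. toolbox_pod is a hypothetical helper name.

```shell
# Print the name of the first Pod carrying the app=toolbox label.
toolbox_pod() {
  kubectl --namespace gitlab get pods -lapp=toolbox \
    -o jsonpath='{.items[0].metadata.name}'
}
```

It can then stand in for the placeholder, for example: `kubectl --namespace gitlab exec -ti "$(toolbox_pod)" -- gitlab-rake gitlab:geo:check`.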

Configure Secondary database

This section is performed on the Secondary Omnibus GitLab database node.

To configure the Secondary database node’s Omnibus GitLab, work from this example configuration:

### Geo Secondary
# external_url must match the Primary