GitLab Geo database replication
Note: This is the documentation for the Omnibus GitLab packages. For installations from source, follow the database replication for installations from source guide.
- Install GitLab Enterprise Edition on the server that will serve as the secondary Geo node. Do not log in or set up anything else in the secondary node for the moment.
- Set up the database replication (primary (read-write) <-> secondary (read-only) topology).
- Configure GitLab to set the primary and secondary nodes.
- Follow the after setup steps.
This document describes the minimal steps you have to take in order to replicate your GitLab database into another server. You may have to change some values according to your database setup, how big it is, etc.
You are encouraged to first read through all the steps before executing them in your testing/production environment.
The GitLab primary node, where the write operations happen, will connect to the primary database server, and the secondary nodes, which are read-only, will connect to secondary database servers (which are read-only too).
Note: In much database documentation you will see primary being referred to as master and secondary as either slave or standby server (read-only).
New for GitLab 9.4: We recommend using PostgreSQL replication slots to ensure the primary retains all the data necessary for the secondaries to recover. See below for more details.
The following guide assumes that:
- You are using PostgreSQL 9.2 or later, which includes the pg_basebackup tool. If you are using Omnibus, it includes the required PostgreSQL version for Geo.
- You have a primary server already set up (the GitLab server you are replicating from), running Omnibus' PostgreSQL (or equivalent version), and you have a new secondary server set up on the same OS and PostgreSQL version. If you are using Omnibus, make sure the GitLab version is the same on all nodes.
- The IP of the primary server for our examples will be 18.104.22.168, whereas the secondary's IP will be 22.214.171.124. Note that the primary and secondary servers MUST be able to communicate over these addresses. These IP addresses can either be public or private.
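The PostgreSQL version assumption above can be checked before starting. The following is a minimal sketch, assuming GNU sort (for its -V version ordering); the helper name pg_version_at_least is hypothetical and not part of GitLab or PostgreSQL.

```shell
# Succeed if ACTUAL is at least REQUIRED, comparing dotted version strings.
# Usage: pg_version_at_least REQUIRED ACTUAL
pg_version_at_least() {
  required=$1
  actual=$2
  # sort -V orders version strings; the required version must sort first (or tie).
  lowest=$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)
  [ "$lowest" = "$required" ]
}

# Example (run on each node; the version string is the last word psql prints):
#   actual=$(/opt/gitlab/embedded/bin/psql --version | awk '{ print $NF }')
#   pg_version_at_least 9.2 "$actual" && echo "PostgreSQL is recent enough"
```

Running the same check on the primary and the secondary also makes it easy to spot a version mismatch between the two nodes.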
SSH into your GitLab primary server and log in as root:

    sudo -i
Omnibus GitLab already has a replication user called gitlab_replicator. You must set its password manually. Replace thepassword with a strong password:

    sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql \
         -d template1 \
         -c "ALTER USER gitlab_replicator WITH ENCRYPTED PASSWORD 'thepassword'"
Edit /etc/gitlab/gitlab.rb and add the following. Note that GitLab 9.1 added the geo_primary_role configuration variable:

    geo_primary_role['enable'] = true
    postgresql['listen_address'] = "18.104.22.168"
    postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','18.104.22.168/32']
    postgresql['md5_auth_cidr_addresses'] = ['22.214.171.124/32']
    # New for 9.4: Set this to be the number of Geo secondary nodes you have
    postgresql['max_replication_slots'] = 1
    # postgresql['max_wal_senders'] = 10
    # postgresql['wal_keep_segments'] = 10

Where 18.104.22.168 is the IP address of the primary server, and 22.214.171.124 is the IP address of the secondary one.
For security reasons, PostgreSQL by default only listens on the local interface (e.g. 127.0.0.1). However, GitLab Geo needs to communicate between the primary and secondary nodes over a common network, such as a corporate LAN or the public Internet. For this reason, we need to configure PostgreSQL to listen on more interfaces.
The listen_address option opens PostgreSQL up to external connections with the interface corresponding to the given IP. See the PostgreSQL documentation for more details.
Note that if you are running GitLab Geo with a cloud provider (e.g. Amazon Web Services), the internal interface IP (as provided by ifconfig) may be different from the public IP address. For example, suppose you have nodes with the following configuration:

| Node Type | Internal IP | External IP    |
|-----------|-------------|----------------|
| Primary   | 10.1.5.3    | 22.214.171.124 |
| Secondary | 10.1.10.5   | 126.96.36.199  |
If you are running two nodes in different cloud availability zones, you may need to double check that the nodes can communicate over the internal IP addresses. For example, servers on Amazon Web Services in the same Virtual Private Cloud (VPC) can do this. Google Compute Engine also offers an internal network that supports cross-availability zone networking.
For the above example, the following configuration uses the internal IPs to replicate the database from the primary to the secondary:
    # Example configuration using internal IPs for a cloud configuration
    geo_primary_role['enable'] = true
    postgresql['listen_address'] = "10.1.5.3"
    postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','10.1.5.3/32']
    postgresql['md5_auth_cidr_addresses'] = ['10.1.10.5/32']
    postgresql['max_replication_slots'] = 1 # Number of Geo secondary nodes
    # postgresql['max_wal_senders'] = 10
    # postgresql['wal_keep_segments'] = 10
If you prefer that your nodes communicate over the public Internet, you may choose the IP addresses from the "External IP" column above.
Optional: If you want to add another secondary, the relevant setting would look like:
    postgresql['md5_auth_cidr_addresses'] = ['22.214.171.124/32','184.108.40.206/32']
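To illustrate how the primary's settings scale with the number of secondaries, here is a hedged sketch that prints the relevant gitlab.rb lines for one primary and any number of secondaries. The function name primary_geo_config is hypothetical, and the second secondary address in the usage comment (10.1.10.6) is made up for illustration. The output is meant to be reviewed and pasted by hand, not written to /etc/gitlab/gitlab.rb automatically.

```shell
# Print primary-node gitlab.rb settings for one primary and N secondaries.
# Usage: primary_geo_config PRIMARY_IP SECONDARY_IP [SECONDARY_IP...]
primary_geo_config() {
  primary_ip=$1
  shift
  # Build the quoted, comma-separated /32 list of secondary addresses.
  md5_list=""
  for ip in "$@"; do
    md5_list="${md5_list:+$md5_list,}'$ip/32'"
  done
  # After the shift, $# is the number of secondaries, i.e. one slot each.
  cat <<EOF
geo_primary_role['enable'] = true
postgresql['listen_address'] = "$primary_ip"
postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','$primary_ip/32']
postgresql['md5_auth_cidr_addresses'] = [$md5_list]
postgresql['max_replication_slots'] = $#
EOF
}

# Example: primary_geo_config 10.1.5.3 10.1.10.5 10.1.10.6
```

Each secondary gets its own entry in md5_auth_cidr_addresses, and max_replication_slots follows the count, matching the "number of Geo secondary nodes" guidance in the configuration comments.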
You may also want to edit max_wal_senders to match your database replication requirements. Consult the PostgreSQL - Replication documentation for more information.
Check to make sure your firewall rules are set so that the secondary nodes can access port 5432 on the primary node.
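One way to check the firewall rule from a secondary node is to attempt a plain TCP connection to the primary. This is a rough sketch, assuming bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are available; port_open is a hypothetical helper name.

```shell
# Succeed if a TCP connection to HOST:PORT can be opened within 5 seconds.
# Usage: port_open HOST PORT
port_open() {
  host=$1
  port=$2
  # bash treats /dev/tcp/HOST/PORT as a TCP socket; inside the child shell,
  # $0 is the host and $1 is the port.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example, run on the secondary against the primary's address:
#   port_open 18.104.22.168 5432 && echo "reachable" || echo "blocked or closed"
```

A failure here only says that the port could not be reached; it does not distinguish a firewall drop from PostgreSQL not listening on that interface.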
Save the file and reconfigure GitLab for the changes to take effect.
New for 9.4: Restart your primary PostgreSQL server to ensure the replication slot changes take effect (sudo gitlab-ctl restart postgresql for Omnibus-provided PostgreSQL).
Now that the PostgreSQL server is set up to accept remote connections, run netstat -plnt to make sure that PostgreSQL is listening on the server's public IP.
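The netstat output can be long, so a small filter helps pick out the PostgreSQL listener. The sketch below assumes the usual Linux net-tools column layout (local address in the fourth column, state in the sixth); listening_addresses is a hypothetical helper name.

```shell
# Read `netstat -plnt` output on stdin and print every local address
# in LISTEN state on the given port (default 5432).
# Usage: netstat -plnt | listening_addresses [PORT]
listening_addresses() {
  port=${1:-5432}
  awk -v port="$port" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $4 }'
}

# Example: sudo netstat -plnt | listening_addresses 5432
```

If the only address printed is 127.0.0.1:5432, the listen_address change has not taken effect yet.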
Continue to set up the secondary server.
SSH into your GitLab secondary server and log in as root:

    sudo -i
Test that the remote connection to the primary server works:
    sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h 18.104.22.168 -U gitlab_replicator -d gitlabhq_production -W
When prompted, enter the password you set in the first step for the gitlab_replicator user. If all worked correctly, you should see the database prompt.
Exit the PostgreSQL console:

    \q
Added in GitLab 9.1: Edit /etc/gitlab/gitlab.rb and add the following:

    geo_secondary_role['enable'] = true
    geo_postgresql['enable'] = true
Reconfigure GitLab for the changes to take effect.
Continue to initiate the replication process.
Below we provide a script that connects to the primary server, replicates the database and creates the needed files for replication.
The directories used are the defaults that are set up in Omnibus. If you have changed any defaults or are using a source installation, configure it as you see fit replacing the directories and paths.
Warning: Make sure to run this on the secondary server, as it removes all of PostgreSQL's data before running pg_basebackup.
SSH into your GitLab secondary server and log in as root:

    sudo -i
New for 9.4: Choose a database-friendly name for your secondary to use as the replication slot name. PostgreSQL slot names may contain only lowercase letters, numbers, and underscores. For example, if your domain is geo-secondary.mydomain.com, you may use geo_secondary_my_domain_com as the slot name.
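If you prefer to derive the slot name mechanically rather than pick one by hand, a sketch like the following works. The helper name slot_name_from_fqdn is hypothetical. It lowercases the host name and replaces every character other than a-z and 0-9 with an underscore, which keeps the result within the characters PostgreSQL accepts for replication slot names.

```shell
# Turn a host name into a replication-slot-friendly identifier.
# Usage: slot_name_from_fqdn FQDN
slot_name_from_fqdn() {
  # printf without a trailing newline, so no stray underscore is appended.
  printf '%s' "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C tr -c 'a-z0-9' '_'
}

# Example: slot_name_from_fqdn geo-secondary.mydomain.com
```

For geo-secondary.mydomain.com this yields geo_secondary_mydomain_com, which differs slightly from the hand-picked example name; any name made of lowercase letters, numbers, and underscores is acceptable as long as the same name is passed to --slot-name.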
Execute the command below to start a backup/restore and begin the replication:
    gitlab-ctl replicate-geo-database --host=18.104.22.168 --slot-name=geo_secondary_my_domain_com
Change the --host= to the primary node IP or FQDN. You can check other possible parameters with --help. When prompted, enter the password you set up for the gitlab_replicator user in the first step.
New for 9.4: Change the --slot-name to the name of the replication slot to be used on the primary database. The script will attempt to create the replication slot automatically if it does not exist.
The replication process is now over.
Now that the database replication is done, the next step is to configure GitLab.
We don't support MySQL replication for GitLab Geo.
Read the troubleshooting document.