Upgrading operating systems for PostgreSQL

If you upgrade the operating system on which PostgreSQL runs, an upgrade of locale data changes in glibc 2.28 might corrupt your database indexes. To avoid this issue, migrate using either:

For more information about how GitLab plans to address this issue, see epic 8573.

Backup and restore

Use pg_dump or the GitLab backup tool, with all data types except db excluded (so only the database is backed up).

The restore recreates the entire database, including the indexes.

Advantages:

  • Simpler and more straightforward than replication.
  • Removes any database bloat in indexes and tables, reducing disk use.

Disadvantages:

  • Downtime is likely to be longer and particularly problematic for large databases.

Replication and failover

Set up streaming replication from the old database server to the new server.

As part of the plan to switch to the new server, reindex all affected indexes:

  • In a database console, run REINDEX INDEX <index name> CONCURRENTLY on each index.

This approach was used for GitLab.com. To learn more about this process and how the different types of indexes were handled, see the blog post about upgrading the operating system on our Postgres database clusters.

Advantages:

  • Downtime is shorter: the time to perform the necessary reindexing, plus validation.
  • Likely to be faster if database is large.

Disadvantages:

  • Technically complicated: setting up replication and ensuring all necessary reindexing is completed.
  • Preserves database bloat.

Checking glibc versions

To see what version of glibc is used, run ldd --version.

You can compare the behavior of glibc on your servers using shell commands.

The following table shows the glibc versions shipped for different operating systems:

Operating system glibc version
CentOS 7 2.17
RedHat Enterprise 8 2.28
RedHat Enterprise 9 2.34
Ubuntu 18.04 2.27
Ubuntu 20.04 2.31
Ubuntu 22.04 2.35

For example, suppose you are upgrading from CentOS 7 to RedHat Enterprise 8. In this case, using PostgreSQL on this upgraded operating system requires using one of the two mentioned approaches, because glibc is upgraded from 2.17 to 2.28. Failing to handle the collation changes properly causes significant failures in GitLab, such as runners not picking jobs with tags.

On the other hand, if PostgreSQL has already been running on glibc 2.28 or higher with no issues, your indexes should continue to work without further action. For example, if you have been running PostgreSQL on RedHat Enterprise 8 (glibc 2.28) for a while, and want to upgrade to RedHat Enterprise 9 (glibc 2.34), there should be no collations-related issues.

Verifying glibc collation versions

For PostgreSQL 13 and higher, you can verify that your database collation version matches your system with this SQL query:

SELECT collname AS COLLATION_NAME,
       collversion AS VERSION,
       pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collprovider = 'c';

Matching collation example

For example, on a Ubuntu 22.04 system, the output of a properly indexed system looks like:

gitlabhq_production=# SELECT collname AS COLLATION_NAME,
       collversion AS VERSION,
       pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collprovider = 'c';
 collation_name | version | actual_version
----------------+---------+----------------
 C              |         |
 POSIX          |         |
 ucs_basic      |         |
 C.utf8         |         |
 en_US.utf8     | 2.35    | 2.35
 en_US          | 2.35    | 2.35
(6 rows)

Mismatched collation example

On the other hand, if you’ve upgraded from Ubuntu 18.04 to 22.04 without reindexing, you might see:

gitlabhq_production=# SELECT collname AS COLLATION_NAME,
       collversion AS VERSION,
       pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collprovider = 'c';
 collation_name | version | actual_version
----------------+---------+----------------
 C              |         |
 POSIX          |         |
 ucs_basic      |         |
 C.utf8         |         |
 en_US.utf8     | 2.27    | 2.35
 en_US          | 2.27    | 2.35
(6 rows)