Troubleshooting and debugging the database

This section is to help give some copy-pasta you can use as a reference when you run into some head-banging database problems.

A first step is to search for your error in Slack, or search for GitLab <my error> with Google.

Available RAILS_ENV:

  • production (generally not for your main GDK database, but you might need this for other installations such as Omnibus).
  • development (this is your main GDK db).
  • test (used for tests like RSpec).

Delete everything and start over

If you just want to delete everything and start over with an empty DB (approximately 1 minute):

bundle exec rake db:reset RAILS_ENV=development

If you want to seed the empty DB with sample data (approximately 4 minutes):

bundle exec rake dev:setup

If you just want to delete everything and start over with sample data (approximately 4 minutes). This also does db:reset and runs DB-specific migrations:

bundle exec rake db:setup RAILS_ENV=development

If your test DB is giving you problems, it is safe to delete everything because it doesn’t contain important data:

bundle exec rake db:reset RAILS_ENV=test

Migration wrangling

  • bundle exec rake db:migrate RAILS_ENV=development: Execute any pending migrations that you might have picked up from a MR
  • bundle exec rake db:migrate:status RAILS_ENV=development: Check if all migrations are up or down
  • bundle exec rake db:migrate:down:main VERSION=20170926203418 RAILS_ENV=development: Tear down a migration
  • bundle exec rake db:migrate:up:main VERSION=20170926203418 RAILS_ENV=development: Set up a migration
  • bundle exec rake db:migrate:redo:main VERSION=20170926203418 RAILS_ENV=development: Re-run a specific migration

Replace main in the above commands to execute against the ci database instead of main.

Manually access the database

Access the database with one of these commands. They all get you to the same place.

gdk psql -d gitlabhq_development
bundle exec rails dbconsole -e development
bundle exec rails db -e development
  • \q: Quit/exit
  • \dt: List all tables
  • \d+ issues: List columns for issues table
  • CREATE TABLE board_labels();: Create a table called board_labels
  • SELECT * FROM schema_migrations WHERE version = '20170926203418';: Check if a migration was run
  • DELETE FROM schema_migrations WHERE version = '20170926203418';: Manually remove a migration

Access the database with a GUI

Most GUIs (DataGrip, RubyMine, DBeaver) require a TCP connection to the database, but by default the database runs on a UNIX socket. To be able to access the database from these tools, some steps are needed:

  1. On the GDK root directory, run:

    gdk config set postgresql.host localhost
    
  2. Open your gdk.yml, and confirm that it has the following lines:

    postgresql:
       host: localhost
    
  3. Reconfigure GDK:

    gdk reconfigure
    
  4. On your database GUI, select localhost as host, 5432 as port and gitlabhq_development as database. You can also use the connection string postgresql://localhost:5432/gitlabhq_development.

The new connection should be working now.

Access the GDK database with Visual Studio Code

Use these instructions for exploring the GitLab database while developing with the GDK:

  1. Install or open Visual Studio Code.
  2. Install the PostgreSQL VS Code Extension.
  3. In Visual Studio Code select PostgreSQL Explorer in the left toolbar.
  4. In the top bar of the new window, select + to Add Database Connection, and follow the prompts to fill in the details:
    1. Hostname: the path to the PostgreSQL folder in your GDK directory (for example /dev/gitlab-development-kit/postgresql).
    2. PostgreSQL user to authenticate as: usually your local username, unless otherwise specified during PostgreSQL installation.
    3. Password of the PostgreSQL user: the password you set when installing PostgreSQL.
    4. Port number to connect to: 5432 (default).
    5. Use an SSL connection? This depends on your installation. Options are:
      • Use Secure Connection
      • Standard Connection (default)
    6. Optional. The database to connect to: gitlabhq_development.
    7. The display name for the database connection: gitlabhq_development.

Your database connection should now be displayed in the PostgreSQL Explorer pane and you can explore the gitlabhq_development database. If you cannot connect, ensure that GDK is running. For further instructions on how to use the PostgreSQL Explorer Extension for Visual Studio Code, read the usage section of the extension documentation.

FAQ

ActiveRecord::PendingMigrationError with Spring

When running specs with the Spring pre-loader, the test database can get into a corrupted state. Trying to run the migration or dropping/resetting the test database has no effect.

$ bundle exec spring rspec some_spec.rb
...
Failure/Error: ActiveRecord::Migration.maintain_test_schema!

ActiveRecord::PendingMigrationError:


  Migrations are pending. To resolve this issue, run:

    bin/rake db:migrate RAILS_ENV=test
# ~/.rvm/gems/ruby-2.3.3/gems/activerecord-4.2.10/lib/active_record/migration.rb:392:in `check_pending!'
...
0 examples, 0 failures, 1 error occurred outside of examples

To resolve, you can kill the spring server and app that lives between spec runs.

$ ps aux | grep spring
eric             87304   1.3  2.9  3080836 482596   ??  Ss   10:12AM   4:08.36 spring app    | gitlab | started 6 hours ago | test mode
eric             37709   0.0  0.0  2518640   7524 s006  S    Wed11AM   0:00.79 spring server | gitlab | started 29 hours ago
$ kill 87304
$ kill 37709

db:migrate database version is too old to be migrated error

Users receive this error when db:migrate detects that the current schema version is older than the MIN_SCHEMA_VERSION defined in the Gitlab::Database library module.

Over time we cleanup/combine old migrations in the codebase, so it is not always possible to migrate GitLab from every previous version.

In some cases you might want to bypass this check. For example, if you were on a version of GitLab schema later than the MIN_SCHEMA_VERSION, and then rolled back the to an older migration, from before. In this case, to migrate forward again, you should set the SKIP_SCHEMA_VERSION_CHECK environment variable.

bundle exec rake db:migrate SKIP_SCHEMA_VERSION_CHECK=true

Performance issues

Reduce connection overhead with connection pooling

Creating new database connections is not free, and in PostgreSQL specifically, it requires forking an entire process to handle each new one. In case a connection lives for a very long time, this is no problem. However, forking a process for several small queries can turn out to be costly. If left unattended, peaks of new database connections can cause performance degradation, or even lead to a complete outage.

A proven solution for instances that deal with surges of small, short-lived database connections is to implement PgBouncer as a connection pooler. This pool can be used to hold thousands of connections for almost no overhead. The drawback is the addition of a small amount of latency, in exchange for up to more than 90% performance improvement, depending on the usage patterns.

PgBouncer can be fine-tuned to fit different installations. See our documentation on fine-tuning PgBouncer for more information.

Run ANALYZE to regenerate database statistics

The ANALYZE command is a good first approach for solving many performance issues. By regenerating table statistics, the query planner creates more efficient query execution paths.

Up to date statistics never hurt!

  • For Linux packages, run:

    gitlab-psql -c 'SET statement_timeout = 0; ANALYZE VERBOSE;'
    
  • On the SQL prompt, run:

    -- needed because this is likely to run longer than the default statement_timeout
    SET statement_timeout = 0;
    ANALYZE VERBOSE;
    

Collect data on ACTIVE workload

Active queries are the only ones actually consuming significant resources from the database.

This query gathers meta information from all existing active queries, along with:

  • their age
  • originating service
  • wait_event (if it’s in the waiting state)
  • other possibly relevant information:
-- long queries are usually easier to read with the fields arranged vertically
\x

SELECT
    pid
    ,datname
    ,usename
    ,application_name
    ,client_hostname
    ,backend_start
    ,query_start
    ,query
    ,age(now(), query_start) AS "age"
    ,state
    ,wait_event
    ,wait_event_type
    ,backend_type
FROM pg_stat_activity
WHERE state = 'active';

This query captures a single snapshot, so consider running the query 3-5 times in a few minutes while the environment is unresponsive:

-- redirect output to a file
-- this location must be writable by `gitlab-psql`
\o /tmp/active1304.out
--
-- now execute the query above
--
-- all output goes to the file - if the prompt is = then it ran
-- cancel writing output
\o

This Python script can help you parse the output of pg_stat_activity into numbers that are easier to understand and correlate to performance issues.

Investigate queries that seem slow

When you identify a query is taking too long to finish, or hogging too much database resources, check how the query planner is executing it with EXPLAIN:

EXPLAIN (ANALYZE, BUFFERS) SELECT ... FROM ...

BUFFERS also show approximately how much memory is involved. I/O might cause the problem, so make sure to add BUFFERS when running EXPLAIN.

If the database is sometimes performant, and sometimes slow, capture this output for the same queries while the environment is in either state.

Investigate index bloat

Index bloat shouldn’t typically cause noticeable performance problems, but it can lead to high disk usage, particularly if there are autovacuum issues.

The query below calculates bloat percentage from PostgreSQL’s own postgres_index_bloat_estimates table, and orders the results by percentage value. PostgresSQL needs some amount of bloat to run correctly, so around 25% still represents standard behavior.

select  a.identifier, a.bloat_size_bytes, b.tablename, b.ondisk_size_bytes,
    (a.bloat_size_bytes/b.ondisk_size_bytes::float)*100 as percentage
from postgres_index_bloat_estimates a
join postgres_indexes b on a.identifier=b.identifier
where
   -- to ensure the percentage calculation doesn't encounter zeroes
   a.bloat_size_bytes>0 and
   b.ondisk_size_bytes>1000000000
order by  percentage desc;

Rebuild indexes

If you identify a bloated table, you can rebuild its indexes using the query below. You should also re-run ANALYZE afterward, as statistics can be reset after indexes are rebuilt.

SET statement_timeout = 0;
REINDEX TABLE CONCURRENTLY <table_name>;

Monitor the index rebuild process by running the query below with \watch 30 added after the semicolon:

SELECT
  t.tablename, indexname, c.reltuples AS num_rows,
  pg_size_pretty(pg_relation_size(quote_ident(t.tablename)::text)) AS table_size,
  pg_size_pretty(pg_relation_size(quote_ident(indexrelname)::text)) AS index_size,
CASE WHEN indisvalid THEN 'Y'
  ELSE 'N'
END AS VALID
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname
LEFT OUTER JOIN
  ( SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS
  number_of_columns, indexrelname, indisvalid FROM pg_index x
JOIN pg_class c ON c.oid = x.indrelid
JOIN pg_class ipg ON ipg.oid = x.indexrelid
JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid )
AS foo
ON t.tablename = foo.ctablename
WHERE
  t.tablename in ('<comma_separated_table_names>')
  ORDER BY 1,2; \watch 30