ClickHouse integration guidelines

  • Tier: Free, Premium, Ultimate
  • Offering: GitLab.com, GitLab Self-Managed
  • Status: Beta on GitLab Self-Managed

For more information on plans for ClickHouse support for GitLab Self-Managed, see this epic.

ClickHouse is an open-source column-oriented database management system. It can efficiently filter, aggregate, and query across large data sets.

ClickHouse is a secondary data store for GitLab. Only specific data is stored in ClickHouse for advanced analytical features such as AI impact analytics and CI Analytics.

You can connect ClickHouse to GitLab either:

Set up ClickHouse

To set up ClickHouse with GitLab:

  1. Run a ClickHouse cluster and configure the database.
  2. Configure the GitLab connection to ClickHouse.
  3. Run ClickHouse migrations.

Run and configure ClickHouse

When you run ClickHouse on a hosted server, resource consumption depends on several factors, such as the number of builds that run on your instance each month, the selected hardware, and the data center that hosts ClickHouse. Regardless, the cost should not be significant.

To create the necessary user and database objects:

  1. Generate a secure password and save it.

  2. Sign in to the ClickHouse SQL console.

  3. Execute the following statements. Replace PASSWORD_HERE with the generated password.

    CREATE DATABASE gitlab_clickhouse_main_production;
    CREATE USER gitlab IDENTIFIED WITH sha256_password BY 'PASSWORD_HERE';
    CREATE ROLE gitlab_app;
    GRANT SELECT, INSERT, ALTER, CREATE, UPDATE, DROP, TRUNCATE, OPTIMIZE ON gitlab_clickhouse_main_production.* TO gitlab_app;
    GRANT SELECT ON information_schema.* TO gitlab_app;
    GRANT gitlab_app TO gitlab;
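Step 1 does not prescribe a tool for generating the password. As a minimal sketch, assuming a Unix-like shell with /dev/urandom available, the following produces a 64-character hexadecimal string. The function name generate_password and the variable PASSWORD are illustrative and not part of GitLab or ClickHouse. Hexadecimal output contains no quote characters, so the value can be pasted into the CREATE USER statement without escaping.

```shell
# Hypothetical helper: read 32 random bytes from the kernel and
# print them as 64 lowercase hexadecimal characters.
generate_password() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Keep the value in a variable so it can be reused in later steps.
PASSWORD="$(generate_password)"
printf '%s\n' "$PASSWORD"
```

Store the printed value in a password manager or secret store, because the same password is needed later in /etc/gitlab/gitlab.rb.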

Configure the GitLab connection to ClickHouse

To provide GitLab with ClickHouse credentials:

  1. Edit /etc/gitlab/gitlab.rb:

    gitlab_rails['clickhouse_databases']['main']['database'] = 'gitlab_clickhouse_main_production'
    gitlab_rails['clickhouse_databases']['main']['url'] = 'https://example.com/path'
    gitlab_rails['clickhouse_databases']['main']['username'] = 'gitlab'
    gitlab_rails['clickhouse_databases']['main']['password'] = 'PASSWORD_HERE' # replace with the actual password

  2. Save the file and reconfigure GitLab:

    sudo gitlab-ctl reconfigure

To verify that your connection is set up successfully:

  1. Sign in to the Rails console.

  2. Execute the following:

    ClickHouse::Client.select('SELECT 1', :main)

    If successful, the command returns [{"1"=>1}].
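If the Rails console check fails, it can help to test the same credentials against ClickHouse directly. The sketch below assumes that the URL configured in gitlab.rb points to the ClickHouse HTTP interface and that curl is installed. The function name check_clickhouse is hypothetical. It sends SELECT 1 as the request body and authenticates with HTTP basic authentication.

```shell
# Hypothetical helper: run "SELECT 1" against a ClickHouse HTTP endpoint.
# Arguments: URL, username, password.
check_clickhouse() {
  url=$1
  user=$2
  password=$3
  # --fail makes curl exit with a non-zero status on HTTP errors,
  # for example when the credentials are rejected.
  curl --silent --show-error --fail \
    --user "$user:$password" \
    --data-binary 'SELECT 1' \
    "$url"
}

# Example call, using the placeholder values from gitlab.rb:
# check_clickhouse 'https://example.com/path' gitlab 'PASSWORD_HERE'
```

With a reachable server and valid credentials, the call should print the query result. An error here points to a network, URL, or credential problem rather than to the GitLab configuration.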

Run ClickHouse migrations

To create the required database objects, run:

sudo gitlab-rake gitlab:clickhouse:migrate

Enable ClickHouse for Analytics

Now that your GitLab instance is connected to ClickHouse, enable ClickHouse for Analytics to turn on the features that use it.