Zoekt

  • Tier: Premium, Ultimate
  • Offering: GitLab Self-Managed
  • Status: Beta

This feature is in beta and subject to change without notice. For more information, see epic 9404.

Zoekt is an open-source search engine designed specifically to search for code.

With this integration, you can use exact code search instead of advanced search to search for code in GitLab. You can use exact match and regular expression modes to search for code in a group or repository.

Install Zoekt

Prerequisites:

  • You must have administrator access to the instance.

To enable exact code search in GitLab, you must have at least one Zoekt node connected to the instance. The following installation methods are supported for Zoekt:

Enable exact code search

Prerequisites:

  • You must have administrator access to the instance.
  • You must install Zoekt.

To enable exact code search in GitLab:

  1. On the left sidebar, at the bottom, select Admin.
  2. Select Settings > Search.
  3. Expand Exact code search configuration.
  4. Select the Enable indexing and Enable searching checkboxes.
  5. Select Save changes.

Check indexing status

Prerequisites:

  • You must have administrator access to the instance.

Indexing performance depends on the CPU and memory limits on the Zoekt indexer nodes. To check indexing status, run this Rake task:

gitlab-rake gitlab:zoekt:info

To have the data refresh automatically every 10 seconds, run this task instead:

gitlab-rake "gitlab:zoekt:info[10]"

Delete offline nodes automatically

Prerequisites:

  • You must have administrator access to the instance.

You can automatically delete Zoekt nodes that are offline for more than 12 hours and their related indices, repositories, and tasks.

To delete offline nodes automatically:

  1. On the left sidebar, at the bottom, select Admin.
  2. Select Settings > Search.
  3. Expand Exact code search configuration.
  4. Select the Delete offline nodes after 12 hours checkbox.
  5. Select Save changes.
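
The 12-hour rule can be pictured as a simple time comparison. The following is a minimal sketch, not GitLab's implementation: the helper name `eligible_for_deletion?` and the idea of measuring from a "last seen" timestamp are assumptions made for illustration.

```ruby
# Hypothetical helper: decides whether a node has been offline long enough
# to be deleted. Measuring from a "last seen" timestamp is an assumption.
def eligible_for_deletion?(last_seen_at, now: Time.now, threshold_seconds: 12 * 60 * 60)
  # Subtracting two Time objects yields the elapsed seconds as a Float.
  (now - last_seen_at) > threshold_seconds
end

now = Time.now
puts eligible_for_deletion?(now - 13 * 60 * 60, now: now) # last seen 13 hours ago
puts eligible_for_deletion?(now - 60 * 60, now: now)      # last seen 1 hour ago
```

In this sketch, a node last seen exactly 12 hours ago is not yet eligible, which matches the "more than 12 hours" wording.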

Index root namespaces automatically

Prerequisites:

  • You must have administrator access to the instance.

You can index both existing and new root namespaces automatically. To index all root namespaces automatically:

  1. On the left sidebar, at the bottom, select Admin.
  2. Select Settings > Search.
  3. Expand Exact code search configuration.
  4. Select the Index root namespaces automatically checkbox.
  5. Select Save changes.

When you enable this setting, GitLab creates indexing tasks for all projects in:

  • All groups and subgroups
  • Any new root namespace

After a project is indexed, GitLab runs only incremental indexing when a repository change is detected.

When you disable this setting:

  • Existing root namespaces remain indexed.
  • New root namespaces are no longer indexed.

Pause indexing

Prerequisites:

  • You must have administrator access to the instance.

To pause indexing for exact code search:

  1. On the left sidebar, at the bottom, select Admin.
  2. Select Settings > Search.
  3. Expand Exact code search configuration.
  4. Select the Pause indexing checkbox.
  5. Select Save changes.

When you pause indexing for exact code search, all changes in your repository are queued. To resume indexing, clear the Pause indexing checkbox.

Set concurrent indexing tasks

Prerequisites:

  • You must have administrator access to the instance.

You can set the number of concurrent indexing tasks for a Zoekt node relative to its CPU capacity.

A higher multiplier means more tasks can run concurrently, which improves indexing throughput at the cost of increased CPU usage. The default value is 1.0 (one task per CPU core).

You can adjust this value based on the node’s performance and workload. To set the number of concurrent indexing tasks:

  1. On the left sidebar, at the bottom, select Admin.

  2. Select Settings > Search.

  3. Expand Exact code search configuration.

  4. In the Indexing CPU to tasks multiplier text box, enter a value.

    For example, if a Zoekt node has 4 CPU cores and the multiplier is 1.5, the number of concurrent tasks for the node is 6.

  5. Select Save changes.
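
The relationship between CPU cores, the multiplier, and concurrent tasks can be sketched as a one-line calculation. The helper name `concurrent_tasks` is hypothetical, and rounding down for fractional results is an assumption; the text only confirms the 4 x 1.5 = 6 case.

```ruby
# Hypothetical helper: estimates concurrent indexing tasks for one Zoekt node.
# Rounding down is an assumption; the source only gives the 4 x 1.5 = 6 example.
def concurrent_tasks(cpu_cores, multiplier)
  raise ArgumentError, "cpu_cores must be positive" unless cpu_cores.positive?
  raise ArgumentError, "multiplier must be positive" unless multiplier.positive?

  (cpu_cores * multiplier).floor
end

puts concurrent_tasks(4, 1.5) # => 6 (the example from the steps above)
puts concurrent_tasks(4, 1.0) # => 4 (default multiplier: one task per core)
```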

Run Zoekt on a separate server

Prerequisites:

  • You must have administrator access to the instance.

To run Zoekt on a different server than GitLab:

  1. Change the Gitaly listening interface.
  2. Install Zoekt.

Zoekt does not support any authentication, so ensure:

  • The Zoekt instance is not publicly accessible.
  • Only the GitLab server has access to the Zoekt server through firewall policies or IP rules.

Troubleshooting

When working with Zoekt, you might encounter the following issues.

Namespace is not indexed

When you enable the Index root namespaces automatically setting, new namespaces are indexed automatically. If a namespace is not indexed automatically, inspect the Sidekiq logs to see if the jobs are being processed. Search::Zoekt::SchedulingWorker is responsible for indexing namespaces.

In a Rails console session, you can check:

  • Namespaces where Zoekt is not enabled:

    Namespace.group_namespaces.root_namespaces_without_zoekt_enabled_namespace
  • The status of Zoekt indices:

    Search::Zoekt::Index.all.pluck(:state, :namespace_id)
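
The `pluck` call returns a flat list of `[state, namespace_id]` pairs, which can be hard to scan on a large instance. The following sketch groups such pairs by state using plain Ruby. The helper name `namespaces_by_state` is hypothetical, and the state names in the sample data are illustrative placeholders, not a confirmed list of Zoekt index states.

```ruby
# Sample data in the shape returned by pluck(:state, :namespace_id).
# The state names here are illustrative placeholders, not a confirmed list.
rows = [
  ["ready", 10],
  ["ready", 11],
  ["pending", 12]
]

# Hypothetical helper: groups namespace IDs under each state.
def namespaces_by_state(rows)
  rows.group_by(&:first).transform_values { |pairs| pairs.map(&:last) }
end

namespaces_by_state(rows).each do |state, namespace_ids|
  puts "#{state}: #{namespace_ids.size} (namespace IDs: #{namespace_ids.join(', ')})"
end
```

In a Rails console session, the result of the `pluck` call could be passed to the helper in place of the sample `rows`.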

To index a namespace manually, run this command:

namespace = Namespace.find_by_full_path('<top-level-group-to-index>')
Search::Zoekt::EnabledNamespace.find_or_create_by(namespace: namespace)

Error: SilentModeBlockedError

You might get a SilentModeBlockedError when you try to run exact code search. This issue occurs when Silent Mode is enabled on the GitLab instance.

To resolve this issue, ensure Silent Mode is disabled.

Error: connections to all backends failing

In application_json.log, you might get the following error:

connections to all backends failing; last error: UNKNOWN: ipv4:1.2.3.4:5678: Trying to connect an http1.x server

To resolve this issue, check if you’re using any proxies. If you are, add the IP address of the GitLab server to no_proxy:

gitlab_rails['env'] = {
  "http_proxy" => "http://proxy.domain.com:1234",
  "https_proxy" => "http://proxy.domain.com:1234",
  "no_proxy" => ".domain.com,IP_OF_GITLAB_INSTANCE,127.0.0.1,localhost"
}

proxy.domain.com:1234 is the domain and port of the proxy instance. IP_OF_GITLAB_INSTANCE is the public IP address of the GitLab instance.

You can get this information by running ip a and checking one of the following:

  • The IP address of the appropriate network interface
  • The public IP address of any load balancer you’re using
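
Before reconfiguring, it can help to confirm that the address you found is actually covered by the no_proxy value. The following is a simplified sketch: the helper `no_proxy_covers?` is hypothetical and handles only exact entries and leading-dot domain suffixes, while real HTTP clients differ in how they interpret no_proxy (for example, CIDR ranges or ports). The IP addresses are documentation placeholders.

```ruby
# Hypothetical helper: checks whether a host or IP is matched by a
# comma-separated no_proxy value. This is a simplified model that handles
# only exact entries and leading-dot domain suffixes.
def no_proxy_covers?(no_proxy, host)
  no_proxy.split(",").map(&:strip).reject(&:empty?).any? do |entry|
    if entry.start_with?(".")
      host.end_with?(entry) # ".domain.com" matches "gitlab.domain.com"
    else
      host == entry         # exact host or IP match
    end
  end
end

no_proxy = ".domain.com,203.0.113.10,127.0.0.1,localhost"
puts no_proxy_covers?(no_proxy, "203.0.113.10") # listed IP: bypasses the proxy
puts no_proxy_covers?(no_proxy, "198.51.100.7") # unlisted IP: goes through the proxy
```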

Verify Zoekt node connections

To verify that your Zoekt nodes are properly configured and connected, in a Rails console session:

  • Check the total number of configured Zoekt nodes:

    Search::Zoekt::Node.count
  • Check how many nodes are online:

    Search::Zoekt::Node.online.count

Alternatively, you can use the gitlab:zoekt:info Rake task.

If the number of online nodes is lower than the number of configured nodes or is zero when nodes are configured, you might have connectivity issues between GitLab and your Zoekt nodes.
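
The comparison described above can be sketched as a small classification helper. The name `zoekt_connectivity` and the returned symbols are hypothetical. In a Rails console session, the two arguments would come from `Search::Zoekt::Node.count` and `Search::Zoekt::Node.online.count`.

```ruby
# Hypothetical helper: classifies node connectivity from two counts.
# configured: total number of configured Zoekt nodes
# online:     number of those nodes currently online
def zoekt_connectivity(configured, online)
  return :no_nodes_configured if configured.zero?
  return :all_offline if online.zero?
  return :partially_offline if online < configured

  :healthy
end

puts zoekt_connectivity(3, 3) # all configured nodes are online
puts zoekt_connectivity(3, 1) # some nodes are unreachable
puts zoekt_connectivity(3, 0) # nodes are configured but none are online
```

Any result other than `:healthy` in this sketch suggests checking the network path between GitLab and the Zoekt nodes.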