Semantic Search

Semantic Search uses vector embeddings to find semantically similar content based on meaning rather than keyword matching, enabling AI features like Duo Chat to retrieve relevant context for user queries.

Overview

Semantic Search converts text into vector embeddings and stores them in a vector store. When a user makes a query, the query is also converted to an embedding and compared against stored vectors to find the most similar results. This approach captures semantic meaning, allowing searches to find relevant content even when exact keywords don’t match.
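As a rough illustration of comparing a query embedding against stored vectors, the following minimal sketch ranks two stored items by cosine similarity. The three-dimensional vectors, the sample content strings, and the choice of cosine similarity are assumptions made for the example. Real embeddings have hundreds of dimensions, and the actual similarity metric depends on the configured vector store.

```ruby
# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Hypothetical stored embeddings, keyed by the content they represent.
stored = {
  "def parse_json(text)" => [0.9, 0.1, 0.0],
  "def render_button(label)" => [0.0, 0.2, 0.9]
}

# Hypothetical embedding of a query such as "read a JSON string".
query_embedding = [0.8, 0.2, 0.1]

# The stored item whose vector points in the most similar direction wins,
# even though the query shares no keywords with it.
best_match = stored.max_by { |_content, vector| cosine_similarity(query_embedding, vector) }.first
```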

Semantic Search features are implemented using the Active Context framework.

Semantic Code Search is the first feature implemented with this framework. You can add support for other Semantic Search Collection types.

Architecture

Semantic Search is powered by the Active Context framework implemented through the gitlab-active-context gem. This framework provides a translation layer for different vector stores (Elasticsearch, OpenSearch, PostgreSQL with pgvector), allowing the same code to work with any supported vector store without needing vector store-specific implementations.

The framework is extensible and designed to support different types of Semantic Search Collections. Each Semantic Search Collection is implemented using:

Collections (Ai::ActiveContext::Collections::<Type>): Define what content is indexed and how it’s stored
References (Ai::ActiveContext::References::<Type>): Track and manage embeddings for content updates
Queries (Ai::ActiveContext::Queries::<Type>): Retrieve similar content from the vector store
Queues (Ai::ActiveContext::Queues::<Type>): Manage asynchronous processing of embedding generation
Migrations: Execute schema changes and data transformations on the vector store

New Collection types can be added by implementing these components for different content types (for example, merge requests or documentation).

Embedding models

For in-depth architecture details, see the Embedding Models design document.

Model metadata

Semantic Search uses embedding models to generate the embeddings used for indexing and search.

The embedding models are stored in the Ai::ActiveContext::Collection record metadata as current_indexing_embedding_model, next_indexing_embedding_model, and search_embedding_model.

Each embedding model metadata has the following information:

model_type: Either gitlab_managed or self_hosted
model_ref: The model identifier (for example, text_embedding_005_vertex or the Ai::SelfHostedModel ID)
field: The vector store field where embeddings are stored
dimensions: The embedding vector dimensions, used when creating the vector store field and for generating embeddings
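To make the shape of this metadata concrete, here is an illustrative sketch in plain Ruby. The `EmbeddingModelMetadata` struct, its `validate!` method, and the field name `embeddings_v1` are hypothetical and exist only for this example. The sketch does not show how the framework actually persists or validates the metadata.

```ruby
VALID_MODEL_TYPES = %w[gitlab_managed self_hosted].freeze

# Hypothetical value object mirroring the four documented metadata fields.
EmbeddingModelMetadata = Struct.new(:model_type, :model_ref, :field, :dimensions, keyword_init: true) do
  def validate!
    raise ArgumentError, "unknown model_type: #{model_type}" unless VALID_MODEL_TYPES.include?(model_type)
    raise ArgumentError, "dimensions must be a positive integer" unless dimensions.is_a?(Integer) && dimensions.positive?

    self
  end
end

# Illustrative collection metadata with only the current indexing model set.
collection_metadata = {
  current_indexing_embedding_model: EmbeddingModelMetadata.new(
    model_type: "gitlab_managed",
    model_ref: "text_embedding_005_vertex",
    field: "embeddings_v1", # hypothetical vector store field name
    dimensions: 768
  ).validate!,
  next_indexing_embedding_model: nil,
  search_embedding_model: nil
}
```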

`ActiveContext::EmbeddingModel` object and embeddings generation

From a Collection class (for example, Ai::ActiveContext::Collections::Code), the embedding models are exposed as ActiveContext::EmbeddingModel objects through the Ai::ActiveContext::Embeddings::ModelFactory.

Embeddings are generated by invoking generate_embeddings on the created ActiveContext::EmbeddingModel object. For example:

ac_embedding_model = Ai::ActiveContext::Collections::Code.current_indexing_embedding_model # this returns an ActiveContext::EmbeddingModel
ac_embedding_model.generate_embeddings(<array_of_content>, user: <optional_user>)

Under the hood, ActiveContext::EmbeddingModel#generate_embeddings executes a Gitlab::Llm::Embeddings::* class which sends an embeddings request to the AI Gateway.

Model switching

Model switching is implemented through the Ai::ActiveContext::EmbeddingModelActivationService, which kicks off a chain of background Active Context Tasks for setting or changing an embedding model.

Asynchronous embeddings generation

Embeddings for indexed content are generated asynchronously through a queue system using reference classes like Ai::ActiveContext::References::Code:

References are tracked: When content is created or updated, embedding references are tracked in the appropriate reference class
Batch processing: References are processed in batches by the Ai::ActiveContext::BulkProcessWorker
Vector storage: Generated embeddings are stored in the configured vector store

The Ai::ActiveContext::BulkProcessWorker is a cron job that runs every minute and processes embedding references from the queue. It fetches references, generates embeddings, and removes them from the queue. If the queue is not empty after processing, the worker re-enqueues itself to continue processing. If embedding generation fails, the references are moved to a retry queue. Items in the retry queue become visible for processing five minutes after they are pushed, so transient errors have time to clear before the retry. If the retry fails, the references are placed on a dead queue.
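The queue behavior described above can be sketched with an in-memory model. The `EmbeddingQueues` class, its method names, the batch size, and the plain-array queues are assumptions for illustration only. This is not how `Ai::ActiveContext::BulkProcessWorker` is implemented.

```ruby
# In-memory stand-in for the main, retry, and dead queues.
class EmbeddingQueues
  RETRY_DELAY_SECONDS = 5 * 60 # retry items become visible five minutes after being pushed

  attr_reader :main, :retry_queue, :dead

  def initialize
    @main = []
    @retry_queue = [] # entries are [reference, visible_at]
    @dead = []
  end

  def push(reference)
    @main << reference
  end

  # Processes due retries and one batch from the main queue. The block
  # generates an embedding for a reference and raises on failure.
  # Returns true when the main queue still has items, which is the signal
  # for the caller to run again instead of waiting for the next cron tick.
  def process_batch(now:, batch_size: 2)
    due, waiting = @retry_queue.partition { |_ref, visible_at| visible_at <= now }
    @retry_queue = waiting

    due.each do |ref, _visible_at|
      begin
        yield ref
      rescue StandardError
        @dead << ref # a failed retry lands on the dead queue
      end
    end

    @main.shift(batch_size).each do |ref|
      begin
        yield ref
      rescue StandardError
        @retry_queue << [ref, now + RETRY_DELAY_SECONDS]
      end
    end

    !@main.empty?
  end
end

queues = EmbeddingQueues.new
%w[ref_ok ref_flaky ref_broken].each { |ref| queues.push(ref) }

attempts = Hash.new(0)
# Hypothetical embedding generator: one reference fails once, one always fails.
generate = lambda do |ref|
  attempts[ref] += 1
  raise "transient error" if ref == "ref_flaky" && attempts[ref] == 1
  raise "permanent error" if ref == "ref_broken"
end

# Simulated clock values in seconds.
[0, 60, 300, 360].each { |now| queues.process_batch(now: now, &generate) }
```

In this run, `ref_flaky` succeeds on its retry, while `ref_broken` fails twice and ends up on the dead queue.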

Query execution

When a query is executed:

Embedding generation: The user’s query is converted to an embedding using the same model as the indexed content
Vector search: The embedding is compared against stored vectors using k-nearest neighbors (KNN) search
Filtering: Results are filtered by relevant criteria (for example, project, file path)
Authorization: Results are filtered to only include content the user has access to
Result limit: By default, the 10 most similar results are returned
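The steps above can be sketched as a brute-force search over an in-memory list. The `Document` struct, the `semantic_search` method, the lambdas, and the use of squared Euclidean distance are all assumptions for the example. A real vector store runs the KNN search and filters itself, and the framework's query classes are not shown here.

```ruby
Document = Struct.new(:id, :project_id, :path, :embedding, keyword_init: true)

def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# embed:      callable that turns the query text into a vector. It must use
#             the same model that produced the stored embeddings.
# authorized: callable that returns true when the user may read the document.
def semantic_search(query, documents:, embed:, project_id:, authorized:, limit: 10)
  query_embedding = embed.call(query)

  documents
    .select { |doc| doc.project_id == project_id }  # filtering
    .select { |doc| authorized.call(doc) }          # authorization
    .min_by(limit) { |doc| squared_distance(query_embedding, doc.embedding) } # nearest neighbors
end

documents = [
  Document.new(id: 1, project_id: 7, path: "lib/parser.rb", embedding: [0.9, 0.1]),
  Document.new(id: 2, project_id: 7, path: "lib/secret.rb", embedding: [0.8, 0.2]),
  Document.new(id: 3, project_id: 8, path: "lib/other.rb", embedding: [0.9, 0.1]),
  Document.new(id: 4, project_id: 7, path: "app/button.rb", embedding: [0.1, 0.9])
]

fake_embed = ->(_text) { [0.85, 0.15] }            # hypothetical stand-in for the embedding model
can_read = ->(doc) { doc.path != "lib/secret.rb" } # hypothetical permission check

results = semantic_search("parse input", documents: documents, embed: fake_embed,
                          project_id: 7, authorized: can_read, limit: 2)
result_ids = results.map(&:id)
```

Filtering before ranking keeps the sketch short. It also means the limit applies only to documents the user can actually see.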

Migrations

The Active Context framework uses a migration system to manage schema changes and data transformations for the connected vector store. Migrations are tracked in the database and executed asynchronously by a worker process.

Ai::ActiveContext::MigrationWorker runs as a cron job every 5 minutes to execute incomplete migrations.

Tasks

For in-depth architecture details, see the Active Context Tasks design document.

The Active Context framework uses a task execution system for managing long-running asynchronous operations. The Active Context Tasks system handles complex workflows like embedding model activation, incremental indexing, and data migrations. It includes built-in support for retries, error handling, and state persistence, ensuring reliable execution of critical operations.

Vector stores

An instance can use one of the following vector stores:

Elasticsearch
OpenSearch
PostgreSQL with pgvector

A vector store connection must be created before semantic search can be used. There are two ways to configure the connection:

Option 1: Using the GitLab UI

For Elasticsearch or OpenSearch clusters used by advanced search:

Navigate to Admin > Settings > Search
Select Connect to the advanced search cluster
The connection is automatically created and configured

Option 2: Using Rails console

Use one of the following approaches. The name field is a user-defined label.

To connect with an explicit URL (for example, OpenSearch):

connection = Ai::ActiveContext::Connection.create!(
  name: "opensearch",
  options: { url: ["http://localhost:9202"] },
  adapter_class: "ActiveContext::Databases::Opensearch::Adapter"
)
connection.activate!

To reuse credentials from an existing advanced search cluster:

connection = Ai::ActiveContext::Connection.create!(
  name: "elasticsearch",
  adapter_class: "ActiveContext::Databases::Elasticsearch::Adapter",
  options: { use_advanced_search_config: true }
)
connection.activate!

For PostgreSQL, use the pgvector extension:

In the PostgreSQL database, create the extension:
```
CREATE EXTENSION vector;
```

In the Rails console, create the connection:

connection = Ai::ActiveContext::Connection.create!(
  name: "postgres",
  options: { host: 'localhost', port: 5432, user: 'postgres', password: '<password>', database: 'postgres' },
  adapter_class: "ActiveContext::Databases::Postgresql::Adapter"
)
connection.activate!

For more information, see the pgvector documentation.

Supported adapter classes:

ActiveContext::Databases::Elasticsearch::Adapter
ActiveContext::Databases::Opensearch::Adapter
ActiveContext::Databases::Postgresql::Adapter

The options hash should contain the connection details specific to your vector store (URL, credentials, and other adapter-specific settings).

Developer Guide

Managing queued items

View all queued items waiting to be processed:

ActiveContext::Queues.all_queued_items

Immediately process all queued items without waiting for cron workers:

ActiveContext.execute_all_queues!

Searching the vector store

Find all items in the vector store:

ActiveContext.adapter.search(
  user: current_user,
  collection: ::Ai::ActiveContext::Collections::Code,
  query: ActiveContext::Query.all
)

Resetting the connection

To start fresh, first destroy the active connection and all of its associated records:

active_connection = ::Ai::ActiveContext::Connection.active
active_connection.migrations.destroy_all
active_connection.repositories.destroy_all
active_connection.enabled_namespaces.destroy_all
active_connection.collections.destroy_all
active_connection.destroy

Then create a new connection as described in Vector stores, and make sure to activate it:

connection.activate!

Semantic Search for Duo Self-hosted

Set up your GDK for Duo Self-hosted

Set up Self-hosted embedding models

Clear the persisted embedding model metadata so that you do not reuse the metadata intended for SaaS:

::Ai::ActiveContext::Collections::Code.collection_record.update_metadata!(
  current_indexing_embedding_model: nil,
  search_embedding_model: nil,
  next_indexing_embedding_model: nil
)

You can set up the embedding models from the admin page, or alternatively set them through the EmbeddingModelActivationService on the Rails console:

::Ai::ActiveContext::EmbeddingModelActivationService.new(
  collection_class: ::Ai::ActiveContext::Collections::Code,
  model_type: 'gitlab_managed',
  model_ref: 'text_embedding_005_vertex',
  dimensions: 768,
  chunk_strategy: 'code_bytes',
  chunk_strategy_size: 1000,
  user: User.first
).execute

Adding a new embedding model to the GitLab offering

To support a new GitLab-managed embedding model, add it to the Model Selection catalog in AI Gateway.

1. Add to `models.yml`

In ai_gateway/model_selection/models.yml, add a new entry for the model, and set the following fields:

gitlab_identifier: the global model identifier used across all GitLab apps. This follows the format of <model_name>_<provider>
model_class_provider: always litellm_embedding
family: set embedding as one of the entries

For the full entry, refer to the following example:

models:
  - name: "text-embedding-005"
    provider: "Gemini Enterprise Agent Platform"
    gitlab_identifier: "text_embedding_005_vertex"
    description: "Natural language processing technique that converts textual data into numerical vectors."
    cost_indicator: "$"
    max_context_tokens: 20000 # https://docs.cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#api_limits
    model_class_provider: litellm_embedding
    family:
      - vertex
      - embedding
    params:
      model: "text-embedding-005"
      custom_llm_provider: vertex_ai
    prompt_params:
      vertex_location: global
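Before opening a merge request, you might sanity-check a new entry against the three field rules listed above. The following sketch uses only Ruby's standard YAML library. The `embedding_entry_problems` helper and the `broken-example` entry are hypothetical, and AI Gateway may apply its own, stricter validation.

```ruby
require "yaml"

# Returns a list of problems for one models.yml entry (empty when none are found).
# Only the three rules named in this section are checked.
def embedding_entry_problems(entry)
  problems = []
  problems << "missing gitlab_identifier" if entry["gitlab_identifier"].to_s.empty?
  unless entry["model_class_provider"] == "litellm_embedding"
    problems << "model_class_provider must be litellm_embedding"
  end
  problems << "family must include embedding" unless Array(entry["family"]).include?("embedding")
  problems
end

# Trimmed sample: the first entry follows the documented example,
# the second is a deliberately broken, made-up entry.
sample = <<~YAML
  models:
    - name: "text-embedding-005"
      gitlab_identifier: "text_embedding_005_vertex"
      model_class_provider: litellm_embedding
      family:
        - vertex
        - embedding
    - name: "broken-example"
      model_class_provider: litellm
      family:
        - vertex
YAML

report = YAML.safe_load(sample).fetch("models").to_h do |entry|
  [entry["name"], embedding_entry_problems(entry)]
end
```

Here `report` maps each model name to its list of problems, so an empty list means the entry passed all three checks.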

2. Add to `unit_primitives.yml`

In ai_gateway/model_selection/unit_primitives.yml, add the new model as a selectable model under the relevant feature setting.

For Semantic Code Search, add the new model under the embeddings_code feature setting.

For new Semantic Search Collections, you must add a new feature setting entry.

Troubleshooting

Semantic search returns no results

Possible causes:

Repository is not indexed yet (state is embedding_indexing_in_progress)
- Check: Ai::ActiveContext::Code::Repository.find_by(project_id: project.id).state
- Solution: Wait for indexing to complete or manually trigger processing by running ActiveContext.execute_all_queues!
Namespace is not eligible
- Check: Ai::ActiveContext::Code::EnabledNamespace.exists?(namespace_id: project.root_namespace.id)
- Solution: Verify namespace meets eligibility criteria
Vector store connection is not configured
- Check: Ai::ActiveContext::Connection.active.present?
- Solution: Configure vector store connection

Managing dead queue items

If embedding generation fails repeatedly, items are placed on the dead queue. Use the Admin API to manage them without a Rails console.

Check dead queue size

The dead queue size is visible in the Embedding Queues section of the Rake task output:

sudo gitlab-rake gitlab:semantic_search:code:info

Clear all dead queue items

To discard all dead queue items:

curl --request DELETE \
  --header "PRIVATE-TOKEN: <your_token>" \
  "https://gitlab.example.com/api/v4/admin/active_context/dead_queue"

Or through ChatOps:

/chatops gitlab run active_context dead_queue clear

Replay dead queue items into a processing queue

To move dead queue items back into a processing queue for another attempt:

curl --request POST \
  --header "PRIVATE-TOKEN: <your_token>" \
  --data "queue=retry_queue" \
  "https://gitlab.example.com/api/v4/admin/active_context/dead_queue/replay"

Or through ChatOps (recommended):

/chatops gitlab run active_context dead_queue replay --queue=retry_queue

Valid queue values are retry_queue, code, and code_backfill. Use retry_queue to attempt processing once more; items that fail again go back to the dead queue. Use code to restart the full embedding pipeline from scratch.
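
If you prefer scripting these calls in Ruby instead of curl, the requests can be built with the standard `net/http` library. This is a minimal sketch that only constructs the requests. The helper names and constants are hypothetical, while the endpoints, header, and `queue` parameter are the ones shown in the curl examples.

```ruby
require "net/http"
require "uri"

DEAD_QUEUE_PATH = "/api/v4/admin/active_context/dead_queue"
REPLAY_QUEUES = %w[retry_queue code code_backfill].freeze

# Builds the DELETE request that discards all dead queue items.
def clear_dead_queue_request(base_url, token)
  request = Net::HTTP::Delete.new(URI.join(base_url, DEAD_QUEUE_PATH))
  request["PRIVATE-TOKEN"] = token
  request
end

# Builds the POST request that replays dead queue items into the given queue.
def replay_dead_queue_request(base_url, token, queue:)
  unless REPLAY_QUEUES.include?(queue)
    raise ArgumentError, "queue must be one of: #{REPLAY_QUEUES.join(', ')}"
  end

  request = Net::HTTP::Post.new(URI.join(base_url, "#{DEAD_QUEUE_PATH}/replay"))
  request["PRIVATE-TOKEN"] = token
  request.set_form_data("queue" => queue) # sent as form data, like curl --data
  request
end

clear_request = clear_dead_queue_request("https://gitlab.example.com", "<your_token>")
replay_request = replay_dead_queue_request("https://gitlab.example.com", "<your_token>", queue: "retry_queue")
```

To send a request, pass it to a connection, for example `Net::HTTP.start(request.uri.hostname, request.uri.port, use_ssl: true) { |http| http.request(request) }`.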