This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.
Status Authors Coach DRIs Owning Stage Created
proposed @shinya.maeda @mikolaj_wawrzyniak @stanhu @pwietchner @oregand @tlinz devops ai-powered 2024-01-25

Vertex AI Search

Retrieve GitLab Documentation

  • Statistics (as of January 2024):
    • Data type: Markdown (Unstructured) written in natural language
    • Data access level: Green (No authorization required)
    • Data source: https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc
    • Data size: approx. 56,000,000 bytes. 2194 pages.
    • Service: https://docs.gitlab.com/ (source repo)
    • Example of user input: “How do I create an issue?”
    • Example of expected AI-generated response: “To create an issue:\n\nOn the left sidebar, select Search or go to and find your project.\n\nOn the left sidebar, select Plan > Issues, and then, in the upper-right corner, select New issue.”

The GitLab documentation site (docs.gitlab.com) is the SSoT service for serving GitLab documentation for SaaS (both GitLab.com and Dedicated) and Self-managed. Since 16.0, when a user opens a documentation link in a GitLab instance, they are redirected to this service (except for air-gapped solutions). In addition, the current search backend of docs.gitlab.com needs to transition to Vertex AI Search. See this issue (GitLab members only) for more information.

We introduce a new semantic search API powered by Vertex AI Search for the documentation tool of GitLab Duo Chat.

We create a search app for each GitLab version. The following steps will likely be automated in the GitLab Documentation project by CI/CD pipelines:

  1. Create a new BigQuery table, for example gitlab-docs-latest or gitlab-docs-v16.4.
  2. Download documents from repositories (e.g. gitlab-org/gitlab/doc, gitlab-org/gitlab-runner/docs, gitlab-org/omnibus-gitlab/doc).
  3. Split them by Markdown headers and generate metadata (e.g. URL and title).
  4. Insert rows into the BigQuery table.
  5. Create a search app.

See this notebook for more implementation details. The data for the latest version will be refreshed by a nightly build using the Data Store API.
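Step 3 (splitting by Markdown headers and generating metadata) could look roughly like the sketch below. It is not the notebook's implementation: the function name `split_by_headers` and the row shape are assumptions made for illustration, with metadata keys (`Header1`, `Header2`, `url`) chosen to mirror the example search result shown later on this page.

```python
import re
from typing import Dict, List

# ATX-style Markdown headers: one to six "#" characters followed by text.
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
FENCE = "`" * 3  # code-fence marker, so "#" comments in code are not read as headers


def split_by_headers(markdown: str, url: str) -> List[Dict]:
    """Split one Markdown page into a chunk per header section (hypothetical helper)."""
    chunks: List[Dict] = []
    headers: Dict[int, str] = {}  # header text currently in effect, keyed by level
    body: List[str] = []
    in_fence = False

    def flush() -> None:
        # Emit the text collected so far together with the headers it sits under.
        content = "\n".join(body).strip()
        if content:
            metadata = {f"Header{level}": text for level, text in sorted(headers.items())}
            metadata["url"] = url
            chunks.append({"content": content, "metadata": metadata})
        body.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # A new header replaces any header at the same or a deeper level.
            for stale in [lvl for lvl in headers if lvl >= level]:
                del headers[stale]
            headers[level] = match.group(2)
        else:
            body.append(line)
    flush()
    return chunks


page = (
    "# Create an issue\n\nIntro text.\n\n"
    "## From a group\n\nTo create an issue from a group:\n\n1. On the left sidebar, ...\n"
)
rows = split_by_headers(page, "https://docs.gitlab.com/ee/user/project/issues/create_issues.html")
```

Each element of `rows` would then correspond to one row inserted in step 4. How rows are actually keyed and loaded into BigQuery is left to the notebook.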

AI Gateway API

The API design follows the existing patterns in AI Gateway.

POST /v1/search/docs
{
  "type": "search",
  "metadata": {
    "source": "GitLab EE",
    "version": "16.3"         // Used for switching search apps for older GitLab instances
  },
  "payload": {
    "query": "How can I create an issue?",
    "params": {               // Params for Vertex AI Search
      "page_size": 10,
      "filter": ""
    },
    "provider": "vertex-ai"
  }
}
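
As an illustration of how a client might assemble this request, here is a minimal sketch using only the Python standard library. The helper names (`build_search_request`, `build_http_request`) and the base URL are hypothetical, authentication headers are omitted, and the request is constructed but deliberately not sent.

```python
import json
import urllib.request
from typing import Any, Dict


def build_search_request(query: str, gitlab_version: str,
                         page_size: int = 10, filter_expr: str = "") -> Dict[str, Any]:
    """Build the JSON body for POST /v1/search/docs (hypothetical helper)."""
    return {
        "type": "search",
        "metadata": {
            "source": "GitLab EE",
            # Used for switching search apps for older GitLab instances.
            "version": gitlab_version,
        },
        "payload": {
            "query": query,
            # Params passed through to Vertex AI Search.
            "params": {"page_size": page_size, "filter": filter_expr},
            "provider": "vertex-ai",
        },
    }


def build_http_request(base_url: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the body in an HTTP POST request; authentication is out of scope here."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/search/docs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The base URL below is a placeholder, not a real AI Gateway address.
request = build_http_request(
    "https://ai-gateway.example.com",
    build_search_request("How can I create an issue?", "16.3"),
)
```

Passing `request` to `urllib.request.urlopen` would send it. Unlike the annotated sample above, the serialized body contains no comments, because strict JSON does not allow them.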

The response will include the search results. For example:

{
  "response": {
    "results": [
      {
        "id": "d0454e6098773a4a4ebb613946aadd89",
        "content": "\nTo create an issue from a group:  \n1. On the left sidebar, ...",
        "metadata": {
          "Header1": "Create an issue",
          "Header2": "From a group",
          "url": "https://docs.gitlab.com/ee/user/project/issues/create_issues.html"
        }
      }
    ]
  },
  "metadata": {
    "provider": "vertex-ai"
  }
}

See SearchRequest and SearchResponse for Vertex AI API specs.
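
A consumer such as the documentation tool would typically flatten these results before building a prompt. The sketch below shows one way to do that, assuming the response shape from the example above. The function name `extract_results` and the breadcrumb-style title are illustrative choices, not part of the proposed API.

```python
import re
from typing import Any, Dict, List

HEADER_KEY = re.compile(r"^Header(\d+)$")


def extract_results(response_body: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten search results into id/title/url/content records (hypothetical helper)."""
    extracted = []
    for result in response_body.get("response", {}).get("results", []):
        metadata = result.get("metadata", {})
        # Collect Header1, Header2, ... in level order to form a breadcrumb title.
        levels = []
        for key, text in metadata.items():
            match = HEADER_KEY.match(key)
            if match:
                levels.append((int(match.group(1)), text))
        title = " > ".join(text for _, text in sorted(levels))
        extracted.append({
            "id": result.get("id", ""),
            "title": title,
            "url": metadata.get("url", ""),
            "content": result.get("content", "").strip(),
        })
    return extracted


sample = {
    "response": {
        "results": [
            {
                "id": "d0454e6098773a4a4ebb613946aadd89",
                "content": "\nTo create an issue from a group:  \n1. On the left sidebar, ...",
                "metadata": {
                    "Header1": "Create an issue",
                    "Header2": "From a group",
                    "url": "https://docs.gitlab.com/ee/user/project/issues/create_issues.html",
                },
            }
        ]
    },
    "metadata": {"provider": "vertex-ai"},
}
records = extract_results(sample)
```

Missing keys fall back to empty values instead of raising, so an empty or partial response yields an empty or partial list.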

Proof of Concept

Evaluation score

Here are the evaluation scores generated by Prompt Library.

| Setup | correctness | comprehensiveness | readability | evaluating_model |
|---|---|---|---|---|
| New (w/ Vertex AI Search) | 3.7209302325581382 | 3.6976744186046511 | 3.9069767441860455 | claude-2 |
| Current (w/ Manual embeddings in GitLab-Rails and PgVector) | 3.7441860465116279 | 3.6976744186046511 | 3.9767441860465116 | claude-2 |

  • Dataset:
    • Input BigQuery table: `dev-ai-research-0e2f8974.duo_chat_external.documentation__input_v1`
    • Output BigQuery tables:
      • `dev-ai-research-0e2f8974.duo_chat_external_results.sm_doc_tool_vertex_ai_search`
      • `dev-ai-research-0e2f8974.duo_chat_external_results.sm_doc_tool_legacy`
    • Command: `promptlib duo-chat eval --config-file /eval/data/config/duochat_eval_config.json`

Estimated Time of Completion

  • Milestone N:
    • Setup in Vertex AI Search with CI/CD automation.
    • Introduce /v1/search/docs endpoint in AI Gateway.
    • Update the retrieval logic in GitLab-Rails.
    • Clean up the feature flag.

Total milestones: 1