This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.
Status: proposed
Authors: @shinya.maeda, @mikolaj_wawrzyniak
Coach: @stanhu
DRIs: @pwietchner, @oregand, @tlinz
Owning Stage: devops ai-powered
Created: 2024-01-25

Elasticsearch

For more information on Elasticsearch and RAG broadly, see the Elasticsearch article in RAG at GitLab.

Retrieve GitLab Documentation

A proof of concept (PoC) was done to store the documentation embeddings in Elasticsearch instead of the embedding database.

Synchronizing embeddings with data source

The procedure that keeps the embeddings up to date in PostgreSQL can also be followed for Elasticsearch.

Retrieval

To get the nearest neighbours, the following query can be executed against an index containing the embeddings:

{
  "knn": {
    "field": vector_field_containing_embeddings,
    "query_vector": embedding_for_question,
    "k": limit,
    "num_candidates": number_of_candidates_to_compare
  }
}
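
As a rough illustration, the placeholders in the query above could be filled in programmatically before the request body is sent to Elasticsearch. The sketch below only builds and validates the request body; it does not call Elasticsearch. The function name `build_knn_query`, the field name `embedding`, and the example vector are hypothetical and not taken from the PoC. The checks that `k` must not exceed `num_candidates`, and that `num_candidates` is capped at 10,000, reflect the limits documented by Elasticsearch for kNN search, and should be verified against the version in use.

```python
import json
from typing import Any, Dict, Sequence

# Assumption: upper bound for num_candidates as documented by Elasticsearch.
MAX_NUM_CANDIDATES = 10_000


def build_knn_query(
    field: str,
    query_vector: Sequence[float],
    k: int,
    num_candidates: int,
) -> Dict[str, Any]:
    """Build the body of a kNN search request (hypothetical helper).

    field:          name of the vector field that stores the embeddings
    query_vector:   embedding computed for the user's question
    k:              number of nearest neighbours to return
    num_candidates: number of candidates to compare on each shard
    """
    if not query_vector:
        raise ValueError("query_vector must not be empty")
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates < k:
        raise ValueError("num_candidates must be greater than or equal to k")
    if num_candidates > MAX_NUM_CANDIDATES:
        raise ValueError(f"num_candidates must not exceed {MAX_NUM_CANDIDATES}")

    return {
        "knn": {
            "field": field,
            "query_vector": [float(x) for x in query_vector],
            "k": k,
            "num_candidates": num_candidates,
        }
    }


if __name__ == "__main__":
    # "embedding" and the three-dimensional vector are illustrative values only.
    body = build_knn_query("embedding", [0.12, -0.4, 0.98], k=5, num_candidates=50)
    print(json.dumps(body, indent=2))
```

A larger `num_candidates` generally improves recall at the cost of a slower search, so the value is a tuning decision rather than a fixed constant.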

Requirements to get to self-managed

  • Productionalize the PoC MR
  • Get more self-managed instances to install Elasticsearch by shipping GitLab with Elasticsearch. Elastic has approved shipping it under the free license. Making it easy for customers to host Elasticsearch is expected to take more than two milestones.