AI features based on 3rd-party integrations contribute
GitLab Duo features are powered by AI models and integrations. This document provides an overview of developing with AI features in GitLab.
For detailed instructions on setting up GitLab Duo licensing in your development environment, see GitLab Duo licensing for local development.
Instructions for setting up GitLab Duo features in the local development environment
For complete setup instructions, see GitLab Duo licensing for local development.
Required: Install AI gateway
Why: Duo features (except for Duo Workflow) route LLM requests through the AI gateway.
How: Follow these instructions to install the AI gateway with GDK. We recommend this route for most users.
You can also install AI gateway by:
We only recommend this for users who have a specific reason for not running the AI gateway through GDK.
Set up and run GDK
For detailed instructions on setting up your GDK for GitLab Duo development, see GitLab Duo licensing for local development.
Tips for local development
- When responses are taking too long to appear in the user interface, consider
restarting Sidekiq by running
gdk restart rails-background-jobs
. If that doesn’t work, trygdk kill
and thengdk start
. - Alternatively, bypass Sidekiq entirely and run the service synchronously.
This can help with debugging errors as GraphQL errors are now available in
the network inspector instead of the Sidekiq logs. To do that, temporarily alter
the
perform_for
method inLlm::CompletionWorker
class by changingperform_async
toperform_inline
.
Feature development (Abstraction Layer)
Feature flags
Apply the following feature flags to any AI feature work:
- A general flag (
ai_duo_chat_switch
) that applies to all GitLab Duo Chat features. It’s enabled by default. - A general flag (
ai_global_switch
) that applies to all other AI features. It’s enabled by default. - A flag specific to that feature. The feature flag name must be different than the licensed feature name.
See the feature flag tracker epic for the list of all feature flags and how to use them.
Push feature flags to AI gateway
You can push feature flags to AI gateway. This is helpful to gradually rollout user-facing changes even if the feature resides in AI gateway. See the following example:
# Push a feature flag state to AI gateway.
Gitlab::AiGateway.push_feature_flag(:new_prompt_template, user)
Later, you can use the feature flag state in AI gateway in the following way:
from ai_gateway.feature_flags import is_feature_enabled
# Check if the feature flag "new_prompt_template" is enabled.
if is_feature_enabled('new_prompt_template'):
# Build a prompt from the new prompt template
else:
# Build a prompt from the old prompt template
IMPORTANT: At the cleaning up step, remove the feature flag in AI gateway repository before removing the flag in GitLab-Rails repository. If you clean up the flag in GitLab-Rails repository at first, the feature flag in AI gateway will be disabled immediately as it’s the default state, hence you might encounter a surprising behavior.
IMPORTANT: Cleaning up the feature flag in AI gateway will immediately distribute the change to all GitLab instances, including GitLab.com, GitLab Self-Managed, and GitLab Dedicated.
Technical details:
When
push_feature_flag
runs on an enabled feature flag, the name of the flag is cached in the current context, which is later attached to thex-gitlab-enabled-feature-flags
HTTP header whenGitLab-Sidekiq/Rails
sends requests to AI gateway.When frontend clients (for example, VS Code Extension or LSP) request a User JWT (UJWT) for direct AI gateway communication, GitLab returns:
- Public headers (including
x-gitlab-enabled-feature-flags
). - The generated UJWT (1-hour expiration).
- Public headers (including
Frontend clients must regenerate UJWT upon expiration. Backend changes such as feature flag updates through ChatOps render the header values to become stale. These header values are refreshed at the next UJWT generation.
Similarly, we also have push_frontend_feature_flag
to push feature flags to frontend.
GraphQL API
To connect to the AI provider API using the Abstraction Layer, use an extendable
GraphQL API called aiAction
.
The input
accepts key/value pairs, where the key
is the action that needs to
be performed. We only allow one AI action per mutation request.
Example of a mutation:
mutation {
aiAction(input: {summarizeComments: {resourceId: "gid://gitlab/Issue/52"}}) {
clientMutationId
}
}
As an example, assume we want to build an “explain code” action. To do this, we extend the input
with a new key,
explainCode
. The mutation would look like this:
mutation {
aiAction(
input: {
explainCode: { resourceId: "gid://gitlab/MergeRequest/52", code: "foo() { console.log() }" }
}
) {
clientMutationId
}
}
The GraphQL API then uses the Anthropic Client to send the response.
How to receive a response
The API requests to AI providers are handled in a background job. We therefore do not keep the request alive and the Frontend needs to match the request to the response from the subscription.
Determining the right response to a request can cause problems when only userId
and resourceId
are used. For example, when two AI features use the same userId
and resourceId
both subscriptions will receive the response from each other. To prevent this interference, we introduced the clientSubscriptionId
.
To match a response on the aiCompletionResponse
subscription, you can provide a clientSubscriptionId
to the aiAction
mutation.
- The
clientSubscriptionId
should be unique per feature and within a page to not interfere with other AI features. We recommend to use aUUID
. - Only when the
clientSubscriptionId
is provided as part of theaiAction
mutation, it will be used for broadcasting theaiCompletionResponse
. - If the
clientSubscriptionId
is not provided, onlyuserId
andresourceId
are used for theaiCompletionResponse
.
As an example mutation for summarizing comments, we provide a randomId
as part of the mutation:
mutation {
aiAction(
input: {
summarizeComments: { resourceId: "gid://gitlab/Issue/52" }
clientSubscriptionId: "randomId"
}
) {
clientMutationId
}
}
In our component, we then listen on the aiCompletionResponse
using the userId
, resourceId
and clientSubscriptionId
("randomId"
):
subscription aiCompletionResponse(
$userId: UserID
$resourceId: AiModelID
$clientSubscriptionId: String
) {
aiCompletionResponse(
userId: $userId
resourceId: $resourceId
clientSubscriptionId: $clientSubscriptionId
) {
content
errors
}
}
The subscription for Chat behaves differently.
To not have many concurrent subscriptions, you should also only subscribe to the subscription once the mutation is sent by using skip()
.
Current abstraction layer flow
The following graph uses VertexAI as an example. You can use different providers.
Reuse the existing AI components for multiple models
We thrive optimizing AI components, such as prompt, input/output parser, tools/function-calling, for each LLM, however, diverging the components for each model could increase the maintenance overhead. Hence, it’s generally advised to reuse the existing components for multiple models as long as it doesn’t degrade a feature quality. Here are the rules of thumbs:
- Iterate on the existing prompt template for multiple models. Do NOT introduce a new one unless it causes a quality degradation for a particular model.
- Iterate on the existing input/output parsers and tools/functions-calling for multiple models. Do NOT introduce a new one unless it causes a quality degradation for a particular model.
- If a quality degradation is detected for a particular model, the shared component should be diverged for the particular model.
An example of this case is that we can apply Claude specific CoT optimization to the other models such as Mixtral as long as it doesn’t cause a quality degradation.
Monitoring
- Error ratio and response latency apdex for each Ai action can be found on Sidekiq Service dashboard under SLI Detail:
llm_completion
. - Spent tokens, usage of each Ai feature and other statistics can be found on periscope dashboard.
- AI gateway logs.
- AI gateway metrics.
- Feature usage dashboard via proxy.
Security
Refer to the secure coding guidelines for Artificial Intelligence (AI) features.
Help
- Here’s how to reach us!
- View guidelines for working with GitLab Duo Chat.
Docs
Edit this page to fix an error or add an improvement in a merge request.
Create an issue to suggest an improvement to this page.
Product
Create an issue if there's something you don't like about this feature.
Propose functionality by submitting a feature request.
Feature availability and product trials
View pricing to see all GitLab tiers and features, or to upgrade.
Try GitLab for free with access to all features for 30 days.
Get help
If you didn't find what you were looking for, search the docs.
If you want help with something specific and could use community support, post on the GitLab forum.
For problems setting up or using this feature (depending on your GitLab subscription).
Request support