Self-hosted model configuration and authentication
- Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.
- Enabled on self-managed in GitLab 17.6.
- Changed to require GitLab Duo add-on in GitLab 17.6 and later.
- Feature flag ai_custom_model removed in GitLab 17.8.
There are two configuration options for self-managed customers:
- GitLab.com AI gateway: Use the GitLab-managed AI gateway with default external large language model (LLM) providers (for example, Google Vertex or Anthropic).
- Self-hosted AI gateway: Deploy and manage your own AI gateway and language models in your infrastructure, without depending on GitLab-provided external language providers.
GitLab.com AI gateway
In this configuration, your GitLab instance depends on and sends requests to the external GitLab AI gateway, which communicates with external AI vendors such as Google Vertex or Anthropic. The response is then forwarded back to your GitLab instance.
Self-hosted AI gateway
In this configuration, the entire system is isolated within the enterprise, ensuring a fully self-hosted environment that safeguards data privacy.
For more information, see the self-hosted model deployment blueprint.
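The difference between the two configurations can be sketched as the path a request takes. This is only an illustration in Python: the `GatewayMode` enum and both helper functions are hypothetical names introduced here and are not part of GitLab.

```python
from enum import Enum


class GatewayMode(Enum):
    """Hypothetical labels for the two configuration options."""
    GITLAB_MANAGED = "gitlab_managed"  # GitLab.com AI gateway
    SELF_HOSTED = "self_hosted"        # AI gateway in your own infrastructure


def request_hops(mode: GatewayMode) -> list[str]:
    """Return the components a request passes through, in order."""
    if mode is GatewayMode.GITLAB_MANAGED:
        return [
            "GitLab instance",
            "GitLab AI gateway (external)",
            "external AI vendor",
        ]
    return [
        "GitLab instance",
        "self-hosted AI gateway",
        "self-hosted model",
    ]


def stays_in_your_infrastructure(mode: GatewayMode) -> bool:
    """True when no hop in the request path is operated outside the enterprise."""
    return not any("external" in hop for hop in request_hops(mode))
```

With these definitions, `stays_in_your_infrastructure(GatewayMode.SELF_HOSTED)` is true, while the GitLab.com AI gateway option involves two external hops.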
Authentication for self-hosted models
Authentication for self-hosted models is made up of the following key components:
- Self-issued tokens: In this architecture, access credentials are not synchronized with cloud.gitlab.com. Instead, tokens are self-issued dynamically, similar to the functionality on GitLab.com. This method provides users with immediate access while maintaining a high level of security.
- Offline environments: In offline setups, there are no connections to cloud.gitlab.com. All requests are routed exclusively to the self-hosted AI gateway.
- Token minting and verification: The GitLab self-managed instance mints the token, which is then verified by the AI gateway against the GitLab instance.
- Model configuration and security: When an administrator configures a model, they can incorporate an API key to authenticate requests. Administrators can also enhance security by specifying connection IP addresses within their network, ensuring that only trusted IPs can interact with the model.
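The API key and trusted-IP checks from the last item can be pictured with a short Python sketch of what a model endpoint might do on each incoming request. Everything here is an assumption made for illustration: the `TRUSTED_NETWORKS` range, the `EXPECTED_API_KEY` value, and the `is_request_allowed` function are placeholders, not GitLab or model-server settings.

```python
import hmac
import ipaddress

# Placeholder values for illustration only.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
EXPECTED_API_KEY = "example-api-key"


def is_request_allowed(client_ip: str, api_key: str) -> bool:
    """Accept a request only if it comes from a trusted IP and carries the right key."""
    address = ipaddress.ip_address(client_ip)
    ip_trusted = any(address in network for network in TRUSTED_NETWORKS)
    # compare_digest avoids leaking key contents through timing differences.
    key_valid = hmac.compare_digest(api_key.encode(), EXPECTED_API_KEY.encode())
    return ip_trusted and key_valid
```

Both conditions must hold: a valid key presented from an untrusted address is rejected, as is a trusted address presenting the wrong key.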
As illustrated in the following diagram:
- The authentication flow begins when the user configures the model through the GitLab instance and submits a request to access the GitLab Duo feature.
- The GitLab instance mints an access token, which the user forwards to GitLab and then to the AI gateway for verification.
- Upon confirming the token’s validity, the AI gateway sends a request to the AI model, which uses the API key to authenticate the request and then processes it.
- The results are then relayed back to the GitLab instance, which completes the flow by sending the response to the user.
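The mint-and-verify steps of this flow can be sketched in Python. The token format is not specified here, so this sketch swaps in a simple HMAC-signed token as a stand-in, and it simplifies "verified against the GitLab instance" to a signing key that both sides know. `SIGNING_KEY`, `mint_token`, and `verify_token` are hypothetical names, not GitLab APIs.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Placeholder key; assumed to be known to both the instance and the gateway.
SIGNING_KEY = b"example-instance-signing-key"


def mint_token(user_id: str, ttl_seconds: int = 3600,
               now: Optional[float] = None) -> str:
    """GitLab instance side: issue a signed, expiring token for a user."""
    issued_at = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": issued_at + ttl_seconds}).encode()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(signature).decode())


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """AI gateway side: return the claims if the token is valid, else None."""
    current = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # signature does not match the payload
    claims = json.loads(payload)
    if claims["exp"] < current:
        return None  # token has expired
    return claims
```

In this sketch the gateway would forward the request to the model only when `verify_token` returns claims. A production setup would more likely use asymmetric signatures so that the gateway never holds a key capable of minting tokens.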