# Self-hosted models supported platforms
- Introduced in GitLab 17.1 with a flag named `ai_custom_model`. Disabled by default.
- Enabled on GitLab Self-Managed in GitLab 17.6.
- Changed to require GitLab Duo add-on in GitLab 17.6 and later.
- Feature flag `ai_custom_model` removed in GitLab 17.8.
There are multiple platforms available to host your self-hosted Large Language Models (LLMs). Each platform has unique features and benefits that can cater to different needs. The following documentation summarizes the currently supported options:
## For self-hosted model deployments
- vLLM. A high-performance inference server optimized for serving LLMs with memory efficiency. It supports model parallelism and integrates easily with existing workflows.
  - vLLM Installation Guide. We recommend installing version v0.6.4.post1 or later.
  - vLLM Supported Models
For information on available options when using vLLM to run a model, see the vLLM documentation on engine arguments.
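As a minimal setup sketch, vLLM can be installed with the version pin recommended above (this assumes a Python 3 environment with `pip` and a CUDA-capable GPU with matching drivers; your environment may require different extras):

```shell
# Install vLLM at or above the recommended version (v0.6.4.post1).
pip install "vllm>=0.6.4.post1"
```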
For example, to set up and run the Mistral model, run the following command:
```shell
HF_TOKEN=HUGGING_FACE_TOKEN python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.3 \
  --served-model-name Mistral-7B-Instruct-v0.3 \
  --tensor-parallel-size 8 \
  --tokenizer_mode mistral \
  --load_format mistral \
  --config_format mistral \
  --tokenizer mistralai/Mistral-7B-Instruct-v0.3
```
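Once the server is running, vLLM exposes an OpenAI-compatible API (on port 8000 by default). A sketch of a test request against that endpoint — the host, port, and prompt here are illustrative, not part of the GitLab configuration:

```shell
# Send a chat completion request to the vLLM server started above.
# Adjust the host and port to match your deployment.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Mistral-7B-Instruct-v0.3",
    "messages": [{"role": "user", "content": "Say hello."}]
  }'
```

A successful response confirms the model is being served before you point your GitLab instance at it.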
## For cloud-hosted model deployments
- AWS Bedrock. A fully managed service that allows developers to build and scale generative AI applications using pre-trained models from leading AI companies. It seamlessly integrates with other AWS services and offers a pay-as-you-go pricing model.

  You must configure the GitLab instance with the appropriate AWS IAM permissions before accessing Bedrock models. You cannot do this in the self-hosted models UI. For example, you can authenticate the AI Gateway instance by defining `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION_NAME` when starting the Docker image. For more information, see the AWS Identity and Access Management (IAM) Guide.
- Azure OpenAI. Provides access to OpenAI’s powerful models, enabling developers to integrate advanced AI capabilities into their applications with robust security and scalable infrastructure.
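Returning to the AWS Bedrock configuration above: the credentials can be supplied as environment variables when starting the AI Gateway container. The following is a sketch only — the image name, port mapping, and credential values are placeholders for your actual deployment:

```shell
# Start the AI Gateway with AWS credentials so it can reach Bedrock models.
# <ai-gateway-image> is a placeholder for your AI Gateway Docker image.
docker run \
  -e AWS_ACCESS_KEY_ID=<your_access_key_id> \
  -e AWS_SECRET_ACCESS_KEY=<your_secret_access_key> \
  -e AWS_REGION_NAME=<your_region> \
  -p 5052:5052 \
  <ai-gateway-image>
```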