Supported self-hosted models and hardware requirements
Tier: Ultimate with GitLab Duo Enterprise
Offering: Self-managed
Status: Beta
History
- Introduced in GitLab 17.1 with a flag named `ai_custom_model`. Disabled by default.
- Enabled on self-managed in GitLab 17.6.
- GitLab Duo add-on required in GitLab 17.6 and later.
Use the following tables and hardware recommendations to select the model that best fits your infrastructure. The tables list each supported model and the GitLab Duo features it supports.
Approved LLMs
Install one of the following GitLab-approved LLM models (a short loading smoke test follows the table):
| Model family | Model | Code completion | Code generation | GitLab Duo Chat |
|--------------|-------|-----------------|-----------------|-----------------|
| Mistral | Codestral 22B | Yes | Yes | No |
| Mistral | Mistral 7B | Yes | No | No |
| Mistral | Mistral 7B-it | Yes | Yes | Yes |
| Mistral | Mixtral 8x7B | Yes | No | No |
| Mistral | Mixtral 8x7B-it | Yes | Yes | Yes |
| Mistral | Mixtral 8x22B | Yes | No | No |
| Mistral | Mixtral 8x22B-it | Yes | Yes | Yes |
| Claude 3 | Claude 3.5 Sonnet | Yes | Yes | Yes |
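Before pointing GitLab at a model, it can help to confirm that the chosen checkpoint loads and generates on your hardware. The sketch below uses vLLM's offline Python API and the Hugging Face identifier for Mistral 7B-it; both the serving library and the model ID are illustrative assumptions, not requirements from this page.

```python
# Minimal smoke test: load an approved model and generate one completion.
# Assumes vLLM is installed (pip install vllm) and the weights fit in GPU memory.
from vllm import LLM, SamplingParams

# Illustrative model identifier; substitute the checkpoint you actually host.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")

params = SamplingParams(temperature=0.2, max_tokens=128)
outputs = llm.generate(["Write a Python function that reverses a string."], params)

for output in outputs:
    print(output.outputs[0].text)
```

The offline API here is only a quick local check; in a real deployment the model is typically exposed through an OpenAI-compatible inference server endpoint that GitLab connects to.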
The following models are under evaluation, and support is limited (a sketch for verifying a served endpoint follows this table):
| Model family | Model | Code completion | Code generation | GitLab Duo Chat |
|--------------|-------|-----------------|-----------------|-----------------|
| CodeGemma | CodeGemma 2b | Yes | No | No |
| CodeGemma | CodeGemma 7b-it (Instruction) | No | Yes | No |
| CodeGemma | CodeGemma 7b-code (Code) | Yes | No | No |
| CodeLlama | Code-Llama 13b-code | Yes | No | No |
| CodeLlama | Code-Llama 13b | No | Yes | No |
| DeepSeekCoder | DeepSeek Coder 33b Instruct | Yes | Yes | No |
| DeepSeekCoder | DeepSeek Coder 33b Base | Yes | No | No |
| GPT | GPT-3.5-Turbo | Yes | Yes | No |
| GPT | GPT-4 | Yes | Yes | No |
| GPT | GPT-4 Turbo | Yes | Yes | No |
| GPT | GPT-4o | Yes | Yes | No |
| GPT | GPT-4o-mini | Yes | Yes | No |
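Whichever model you choose, it is worth confirming that the serving endpoint answers completion requests before configuring it in GitLab. The URL, port, and model name below are placeholders, and the `/v1/chat/completions` route assumes an OpenAI-compatible server such as vLLM's default API server.

```python
# Minimal sketch: verify a self-hosted model endpoint answers a chat completion.
# The endpoint URL, port, and model name are placeholders for your deployment.
import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # assumed OpenAI-compatible server

payload = {
    "model": "mistralai/Mistral-7B-Instruct-v0.3",  # must match the served model name
    "messages": [{"role": "user", "content": "Reply with the single word: ready"}],
    "max_tokens": 8,
}

response = requests.post(ENDPOINT, json=payload, timeout=30)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```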
Hardware requirements
For optimal performance, the following hardware specifications are recommended for hosting these models:
- CPU: Minimum 8 cores (16 threads recommended).
- RAM: At least 32 GB (64 GB or more recommended for larger models; see the sizing sketch after this list).
- GPU: NVIDIA A100 or equivalent for optimal inference performance.
- Storage: SSD with sufficient space for model weights and data.
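Weight memory scales with parameter count and numeric precision: roughly 2 bytes per parameter in 16-bit precision, before KV-cache and activation overhead. The sketch below applies that back-of-the-envelope arithmetic to a few models from the tables above; the 20% overhead factor is an assumption, not a measured figure.

```python
# Back-of-the-envelope GPU memory estimate for model weights.
# bytes_per_param: 2 for fp16/bf16, 1 for 8-bit, 0.5 for 4-bit quantization.
# The 1.2 overhead factor (KV cache, activations) is an assumption.

def estimate_gib(params_billion: float, bytes_per_param: float = 2.0,
                 overhead: float = 1.2) -> float:
    """Return an approximate GPU memory requirement in GiB."""
    return params_billion * 1e9 * bytes_per_param * overhead / 2**30

models = {
    "Mistral 7B-it": 7,
    "Codestral 22B": 22,
    "Mixtral 8x7B-it": 47,    # total parameters across experts
    "Mixtral 8x22B-it": 141,  # total parameters across experts
}

for name, size in models.items():
    print(f"{name}: ~{estimate_gib(size):.0f} GiB in 16-bit precision")
```

By this estimate, Mistral 7B fits on a single 40 GB A100, while Codestral 22B and the Mixtral models call for an 80 GB card, multiple GPUs, or quantization.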