Set up your self-hosted model deployment infrastructure
- Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.
When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external architecture, ensuring maximum levels of security.
To set up your self-hosted model deployment infrastructure:
- Install the large language model (LLM) serving infrastructure.
- Install the GitLab AI Gateway.
Step 1: Install LLM serving infrastructure
Install one of the following GitLab-approved LLM models:
Model | Code Completion | Code Generation | Duo Chat |
---|---|---|---|
CodeGemma 2b | ✅ | - | - |
CodeGemma 7b-it | - | ✅ | - |
CodeGemma 7b | ✅ | - | - |
Code-Llama 13b-code | ✅ | - | - |
Code-Llama 13b | - | ✅ | - |
Codestral 22B (see setup instructions) | ✅ | ✅ | - |
Mistral 7B | - | ✅ | ✅ |
Mixtral 8x22B | - | ✅ | ✅ |
Mixtral 8x7B | - | ✅ | ✅ |
Recommended serving architectures
For Mistral, you should use one of the following architectures:
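If you serve Mistral 7B with vLLM (one widely used open-source serving engine; whether it matches the architectures recommended for your environment is an assumption here), a minimal sketch looks like this. The model identifier and port are illustrative placeholders, not values prescribed by GitLab:
pip install vllm
python -m vllm.entrypoints.openai.api_server \
 --model mistralai/Mistral-7B-Instruct-v0.2 \
 --port 4000
Whichever serving engine you use, note the endpoint URL it exposes; you will need it when you later configure the self-hosted model in your GitLab instance.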
Step 2: Install the GitLab AI Gateway
Install by using Docker
Prerequisites:
- You must install Docker.
- Use a valid hostname accessible within your network. Do not use localhost.
The GitLab AI Gateway Docker image contains all necessary code and dependencies in a single container.
Find the GitLab official Docker image at:
Set up the volumes location
Create a directory where the logs will reside on the Docker host. It can be under your user’s home directory (for example ~/gitlab-agw), or in a directory like /srv/gitlab-agw. To create that directory, run:
sudo mkdir -p /srv/gitlab-agw
If you’re running Docker with a user other than root, ensure appropriate permissions have been granted to that directory.
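For example, assuming the Docker commands are run by a user named docker-user (a placeholder, not a name defined by this guide), you could grant that user ownership of the directory:
sudo chown -R docker-user:docker-user /srv/gitlab-agw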
Optional: Download documentation index
To improve results when asking GitLab Duo Chat questions about GitLab, you can index GitLab documentation and provide it as a file to the AI Gateway.
To index the documentation in your local installation, run:
pip install requests langchain langchain_text_splitters
python3 scripts/custom_models/create_index.py -o <path_to_created_index/docs.db>
This creates a file docs.db at the specified path.
You can also create an index for a specific GitLab version:
python3 scripts/custom_models/create_index.py --version_tag="{gitlab-version}"
Find the AI Gateway release
In a production environment, you should set your deployment to a specific
GitLab AI Gateway. In the AI Gateway container registry, find the image that corresponds to the version of your GitLab instance. For example, if your GitLab instance
has version 17.2.0-ee
, then use registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:gitlab-v17.2.0
.
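For example, to pull that tag ahead of time so the deployment does not depend on pulling at start-up:
docker pull registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:gitlab-v17.2.0
Use that tag in place of <image> in the run command that follows.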
docker run --name gitlab-aigw -p 5052:5052 \
-e AIGW_CUSTOM_MODELS__ENABLED=true \
-v path/to/created/index/docs.db:/app/tmp/docs.db \
-e AIGW_FASTAPI__OPENAPI_URL="/openapi.json" \
-e AIGW_AUTH__BYPASS_EXTERNAL=true \
-e AIGW_FASTAPI__DOCS_URL="/docs" \
-e AIGW_FASTAPI__API_PORT=5052 \
<image>
The arguments AIGW_FASTAPI__OPENAPI_URL and AIGW_FASTAPI__DOCS_URL are not mandatory, but are useful for debugging. From the host, accessing http://localhost:5052/docs should open the AI Gateway API documentation.
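To check this from a terminal instead of a browser, a quick sketch, assuming the container was started with the port mapping and debug URLs shown above:
curl -i http://localhost:5052/docs
An HTTP 200 response indicates that the AI Gateway is up and serving its API documentation.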
Install by using Docker Engine
- For the AI Gateway to access the API, it must know where the GitLab instance is located. To do this, set the environment variables AIGW_GITLAB_URL and AIGW_GITLAB_API_URL:
AIGW_GITLAB_URL=https://<your_gitlab_domain>
AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/
- For the GitLab instance to know where the AI Gateway is located so it can access the gateway, set the environment variable AI_GATEWAY_URL inside your GitLab instance environment variables:
AI_GATEWAY_URL=https://<your_ai_gitlab_domain>
CLOUD_CONNECTOR_SELF_SIGN_TOKENS=1
- After you’ve set up the environment variables, run the image. For example:
docker run --name gitlab-aigw -p 5052:5052 -e AIGW_CUSTOM_MODELS__ENABLED=true registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:latest
This command downloads and starts an AI Gateway container and publishes the port needed to access its API.
- Track the initialization process:
sudo docker logs -f gitlab-aigw
After starting the container, visit gitlab-aigw.example.com. It might take a while before the Docker container starts to respond to queries.
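To confirm that the gateway is reachable from the GitLab instance, you can request the docs endpoint from the GitLab host. This sketch assumes AI_GATEWAY_URL is exported in the current shell and that the debug URLs were enabled as shown earlier:
curl -i "$AI_GATEWAY_URL/docs"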
Upgrade the AI Gateway
To upgrade the AI Gateway, download the newest Docker image tag.
- Stop the running container:
sudo docker stop gitlab-aigw
- Remove the existing container:
sudo docker rm gitlab-aigw
- Pull and run the new image:
docker run --name gitlab-aigw -p 5052:5052 \
-e AIGW_CUSTOM_MODELS__ENABLED=true \
-v path/to/created/index/docs.db:/app/tmp/docs.db \
-e AIGW_FASTAPI__OPENAPI_URL="/openapi.json" \
-e AIGW_AUTH__BYPASS_EXTERNAL=true \
-e AIGW_FASTAPI__DOCS_URL="/docs" \
-e AIGW_FASTAPI__API_PORT=5052 \
registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:v1.11.0
- Ensure that the environment variables are all set correctly.
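To confirm that the new release is the one running, a quick check with standard Docker commands, assuming the container is named gitlab-aigw as in the steps above:
docker ps --filter name=gitlab-aigw --format '{{.Image}}'
The output should show the image tag you just deployed.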
Alternative installation methods
For information on alternative ways to install the AI Gateway, see issue 463773.
Troubleshooting
The image’s platform does not match the host
When finding the AI Gateway release, you might get an error that states The requested image’s platform (linux/amd64) does not match the detected host.
To work around this error, add --platform linux/amd64 to the docker run command:
docker run -e AIGW_CUSTOM_MODELS__ENABLED=true --platform linux/amd64 \
-v path/to/created/index/docs.db:/app/tmp/docs.db \
-e AIGW_FASTAPI__OPENAPI_URL="/openapi.json" \
-e AIGW_AUTH__BYPASS_EXTERNAL=true \
-e AIGW_FASTAPI__DOCS_URL="/docs" \
-e AIGW_FASTAPI__API_PORT=5052 \
registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:v1.11.0
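To confirm whether your host is actually a non-amd64 platform (for example, Apple Silicon machines report arm64 or aarch64), a quick check:
uname -m
docker info --format '{{.Architecture}}'
If the host is not amd64, the --platform flag makes Docker run the amd64 image under emulation, which works but can be noticeably slower.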