Set up your self-hosted model deployment infrastructure

Tier: For a limited time, Premium and Ultimate. In the future, GitLab Duo Enterprise.
Offering: Self-managed
Status: Experiment
History
The availability of this feature is controlled by a feature flag. For more information, see the history.
caution
This feature is considered experimental and is not intended for customer usage outside of initial design partners. We expect major changes to this feature.
This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.

When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external architecture, so all requests and data stay inside your own infrastructure.

To set up your self-hosted model deployment infrastructure:

  1. Install the large language model (LLM) serving infrastructure.
  2. Install the GitLab AI Gateway.

Step 1: Install LLM serving infrastructure

Install one of the following GitLab-approved LLMs:

For Mistral, you should use one of the following architectures:

Step 2: Install the GitLab AI Gateway

Install by using Docker

Prerequisites:

  • You must install Docker.
  • Use a valid hostname accessible within your network. Do not use localhost.
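
Both prerequisites can be checked before you continue. The following is a minimal sketch, not part of the official procedure: the function name `check_aigw_host` and the list of rejected names are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not shipped with GitLab).
# Rejects empty values and loopback names, which other hosts and
# containers cannot use to reach the gateway.
check_aigw_host() {
  case "$1" in
    ""|localhost|localhost.*|127.*|::1)
      echo "invalid hostname: '$1'" >&2
      return 1
      ;;
  esac
  return 0
}

# Report whether the docker CLI is on PATH; this starts nothing.
if command -v docker >/dev/null 2>&1; then
  echo "docker found: $(command -v docker)"
else
  echo "docker not found; install Docker first" >&2
fi
```

For example, `check_aigw_host gitlab-aigw.example.com` succeeds, while `check_aigw_host localhost` prints an error and returns a non-zero status.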

The GitLab AI Gateway Docker image contains all necessary code and dependencies in a single container.

Find the GitLab official Docker image at:

caution
Docker for Windows is not officially supported. There are known issues with volume permissions, and potentially other unknown issues. If you are trying to run on Docker for Windows, see the getting help page for links to community resources (such as IRC or forums) to seek help from other users.

Set up the volumes location

Create a directory where the logs will reside on the Docker host. It can be under your user’s home directory (for example ~/gitlab-agw), or in a directory like /srv/gitlab-agw. To create that directory, run:

sudo mkdir -p /srv/gitlab-agw

If you’re running Docker with a user other than root, ensure appropriate permissions have been granted to that directory.
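
As a rough sketch of the non-root case, the following creates the log directory under the home directory and checks that the current user can write to it. The variable name `AIGW_LOG_DIR` and the `750` mode are assumptions for this example, not settings read by the AI Gateway.

```shell
#!/bin/sh
# AIGW_LOG_DIR is a hypothetical variable name used only in this sketch.
# It defaults to the home-directory location mentioned above.
AIGW_LOG_DIR="${AIGW_LOG_DIR:-$HOME/gitlab-agw}"

# Create the directory (no sudo needed under your own home directory).
mkdir -p "$AIGW_LOG_DIR"

# Restrict access to the owner and group (assumed policy; adjust as needed).
chmod 750 "$AIGW_LOG_DIR"

# Confirm the current user can write there before starting the container.
if [ -w "$AIGW_LOG_DIR" ]; then
  echo "log directory ready: $AIGW_LOG_DIR"
else
  echo "no write access to $AIGW_LOG_DIR" >&2
fi
```

If you use /srv/gitlab-agw instead, the directory is created by root, so ownership would typically need to be handed to the user that runs Docker, for example with `sudo chown`.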

Find the AI Gateway release

In a production environment, you should set your deployment to a specific GitLab AI Gateway release. Find the release to use in GitLab AI Gateway releases, for example:

docker run -p 5000:500 -e AIGW_CUSTOM_MODELS__ENABLED=true registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:v1.4.0

To set your deployment to the latest stable release, use the latest tag:

docker run -p 5000:500 -e AIGW_CUSTOM_MODELS__ENABLED=true registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:latest

note
Multi-arch images are not yet supported; the image is built only for linux/amd64. If you run it on a Mac with Apple silicon, add --platform linux/amd64 to the docker run command.
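
One way to apply the platform flag only where it is needed is to derive it from the host architecture. This is a sketch under assumptions: the helper name `platform_flag` is hypothetical, and the command is printed for review rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: prints the extra docker flag needed on ARM hosts,
# and prints nothing on hosts that already match linux/amd64.
platform_flag() {
  case "$1" in
    arm64|aarch64) echo "--platform linux/amd64" ;;
    *) : ;;
  esac
}

IMAGE="registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:latest"

# Print the command for this host instead of running it.
# The command substitution is left unquoted on purpose so that the flag
# and its value become two separate words (or vanish when empty).
echo docker run $(platform_flag "$(uname -m)") -p 5000:500 \
  -e AIGW_CUSTOM_MODELS__ENABLED=true "$IMAGE"
```

On Apple silicon, `uname -m` reports `arm64`, so the printed command includes `--platform linux/amd64`; on an x86_64 host the flag is omitted.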

Install by using Docker Engine

  1. For the AI Gateway to access the API, it must know where the GitLab instance is located. To do this, set the environment variables AIGW_GITLAB_URL and AIGW_GITLAB_API_URL:

    AIGW_GITLAB_URL=https://<your_gitlab_domain>
    AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/
    
  2. For the GitLab instance to know where AI Gateway is located so it can access the gateway, set the environment variable AI_GATEWAY_URL inside your GitLab instance environment variables:

    AI_GATEWAY_URL=https://<your_ai_gitlab_domain>
    
  3. After you’ve set up the environment variables, run the image. For example:

    docker run --name gitlab-aigw -p 5000:500 -e AIGW_CUSTOM_MODELS__ENABLED=true registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:latest
    

    This command downloads and starts an AI Gateway container, and publishes the port needed to access the gateway.

  4. Track the initialization process:

    sudo docker logs -f gitlab-aigw
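
The variables from step 1 only take effect if they reach the container. One possible approach, sketched here under assumptions, is to write them to an env file and pass it with Docker's `--env-file` option. The domain `gitlab.example.com`, the file location, and the function name `run_gateway` are placeholders chosen for this example.

```shell
#!/bin/sh
# Hypothetical values; replace with your own domain and file location.
GITLAB_DOMAIN="gitlab.example.com"
ENV_FILE="${TMPDIR:-/tmp}/aigw.env"

# Write the gateway settings from step 1 to an env file.
cat > "$ENV_FILE" <<EOF
AIGW_GITLAB_URL=https://${GITLAB_DOMAIN}
AIGW_GITLAB_API_URL=https://${GITLAB_DOMAIN}/api/v4/
AIGW_CUSTOM_MODELS__ENABLED=true
EOF

# Defined but not called here: starts the container with those settings.
# --name makes the container addressable as gitlab-aigw in later commands.
run_gateway() {
  docker run -d --name gitlab-aigw --env-file "$ENV_FILE" -p 5000:500 \
    registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:latest
}

# Show what would be passed to the container.
cat "$ENV_FILE"
```

Calling `run_gateway` after reviewing the file starts the container in the background, after which the `docker logs` command in step 4 can follow its output.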
    

After starting the container, visit gitlab-aigw.example.com. It might take a while before the Docker container starts to respond to queries.
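
Because the container can take a while to respond, a small retry loop can replace manual refreshing. This is a generic sketch: `wait_until` is a hypothetical helper, and the URL in the commented example is an assumption based on the hostname and published port used on this page.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the probed command exits with status 0.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # All attempts failed.
  return 1
}

# Example (commented out; the URL is an assumption, adjust to your setup):
# wait_until 30 2 curl --fail --silent http://gitlab-aigw.example.com:5000/ \
#   && echo "AI Gateway is responding"
```

The helper returns a non-zero status when every attempt fails, so it can gate later steps in a deployment script.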

Upgrade the AI Gateway

To upgrade the AI Gateway, download the newest Docker image tag.

  1. Stop the running container:

    sudo docker stop gitlab-aigw
    
  2. Remove the existing container:

    sudo docker rm gitlab-aigw
    
  3. Pull and run the new image:

    docker run --name gitlab-aigw -p 5000:500 -e AIGW_CUSTOM_MODELS__ENABLED=true registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:latest
    
  4. Ensure that the environment variables are all set correctly.
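
The upgrade steps above can be collected into one function. This is a sketch, not an official script: the function name `upgrade_gateway` and the `DOCKER` override are assumptions, and the `run` options mirror the example command on this page. Add any environment variables your deployment needs.

```shell
#!/bin/sh
# DOCKER can be overridden, for example DOCKER="sudo docker", or
# DOCKER=echo for a dry run that only prints the commands.
DOCKER="${DOCKER:-docker}"
IMAGE_REPO="registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway"

# Hypothetical helper. Usage: upgrade_gateway <tag>
upgrade_gateway() {
  tag="${1:-latest}"
  # Steps 1 and 2: stop and remove the old container.
  $DOCKER stop gitlab-aigw
  $DOCKER rm gitlab-aigw
  # Step 3: pull the new image, then start a container from it.
  $DOCKER pull "$IMAGE_REPO:$tag"
  $DOCKER run -d --name gitlab-aigw -p 5000:500 \
    -e AIGW_CUSTOM_MODELS__ENABLED=true "$IMAGE_REPO:$tag"
}
```

Setting `DOCKER=echo` before calling `upgrade_gateway v1.4.0` prints the four commands without touching the running container, which is a convenient way to review them first.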

Alternative installation methods

For information on alternative ways to install the AI Gateway, see issue 463773.