Set up your self-hosted model infrastructure

Tier: For a limited time, Ultimate. On October 17, 2024, Ultimate with GitLab Duo Enterprise.
Offering: Self-managed
Status: Beta
History
The availability of this feature is controlled by a feature flag. For more information, see the history.

When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external architecture, so all requests and data stay inside your own infrastructure.

To set up your self-hosted model infrastructure:

  1. Install the large language model (LLM) serving infrastructure.
  2. Configure your GitLab instance.
  3. Install the GitLab AI Gateway.

For an installation video guide, see Self-Hosted Models Deployment.

For an installation video guide in French, see Self-Hosted Models Deployment (French Language version).

Install large language model serving infrastructure

Install one of the following GitLab-approved large language models:

Model                                     Code completion   Code generation   GitLab Duo Chat
----------------------------------------  ----------------  ----------------  ---------------
CodeGemma 2b                              Yes               No                No
CodeGemma 7b-it (Instruction)             No                Yes               No
CodeGemma 7b-code (Code)                  Yes               No                No
Code-Llama 13b-code                       Yes               No                No
Code-Llama 13b                            No                Yes               No
Codestral 22B (see setup instructions)    Yes               Yes               No
Mistral 7B                                No                Yes               Yes
Mixtral 8x22B                             No                Yes               Yes
Mixtral 8x7B                              No                Yes               Yes
DeepSeek Coder 33b Instruct               Yes               Yes               No
DeepSeek Coder 33b Base                   Yes               No                No

Use a serving architecture

To host your models, you should use:

  • For non-cloud on-premises deployments, vLLM.
  • For cloud deployments, AWS Bedrock as a cloud provider.
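
For the on-premises option, a minimal sketch of starting vLLM and checking that it answers might look like the following. The model ID, the port, and the names MODEL_ID, VLLM_PORT, start_vllm, and list_vllm_models are example values chosen for illustration, not GitLab requirements. Check the vLLM documentation for the options your hardware and model need.

```shell
# Sketch: serve one model through vLLM's OpenAI-compatible server.
# MODEL_ID and VLLM_PORT are example values (assumptions), not GitLab settings.
MODEL_ID="${MODEL_ID:-mistralai/Mistral-7B-Instruct-v0.3}"
VLLM_PORT="${VLLM_PORT:-8000}"

start_vllm() {
  # Runs in the foreground; --host 0.0.0.0 listens on all interfaces.
  vllm serve "$MODEL_ID" --host 0.0.0.0 --port "$VLLM_PORT"
}

list_vllm_models() {
  # Asks the running server which models it exposes.
  curl --silent --fail "http://localhost:${VLLM_PORT}/v1/models"
}
```

Run start_vllm in one terminal and, after the model has loaded, list_vllm_models in another to confirm that the server responds.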

Configure your GitLab instance

Prerequisites:

  • Upgrade to the latest version of GitLab.

  1. The GitLab instance must be able to access the AI Gateway.

    1. Where your GitLab instance is installed, update the /etc/gitlab/gitlab.rb file.

      sudo vim /etc/gitlab/gitlab.rb
      
    2. Add and save the following environment variables.

      gitlab_rails['env'] = {
        'GITLAB_LICENSE_MODE' => 'production',
        'CUSTOMER_PORTAL_URL' => 'https://customers.gitlab.com',
        'AI_GATEWAY_URL' => '<path_to_your_ai_gateway>'
      }

    3. Run reconfigure:

      sudo gitlab-ctl reconfigure
      
  2. Start a GitLab Rails console:

    sudo gitlab-rails console
    

    In the console, enable feature flags:

    Feature.enable(:self_hosted_models_beta_ended)
    Feature.enable(:ai_custom_model)
    

    Exit the Rails console.

Install the GitLab AI Gateway

Install by using Docker

Prerequisites:

  • Install a Docker container engine, such as Docker.
  • Use a valid hostname accessible within your network. Do not use localhost.

The GitLab AI Gateway Docker image contains all necessary code and dependencies in a single container.

Find the AI Gateway release

Find the GitLab official Docker image at:

Use the image tag that corresponds to your GitLab version. For example, if the GitLab version is v17.5.0, use the self-hosted-v17.5.0-ee tag.
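
As a small illustration of this tag convention, the following shell sketch derives the tag from a version string. The function name aigw_image_tag is hypothetical, and the pattern is inferred only from the v17.5.0 example, so confirm that the resulting tag exists in the registry before you use it.

```shell
# Hypothetical helper: build the AI Gateway image tag from a GitLab version.
# Assumes the pattern self-hosted-v<version>-ee from the example above.
aigw_image_tag() {
  version="${1#v}"   # accept both "17.5.0" and "v17.5.0"
  printf 'self-hosted-v%s-ee\n' "$version"
}

aigw_image_tag "v17.5.0"   # prints self-hosted-v17.5.0-ee
```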

Set up the volumes location

Create a directory where the logs will reside on the Docker host. It can be under your user’s home directory (for example ~/gitlab-agw), or in a directory like /srv/gitlab-agw. To create that directory, run:

sudo mkdir -p /srv/gitlab-agw

If you’re running Docker with a user other than root, ensure appropriate permissions have been granted to that directory.
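
One possible way to grant those permissions is sketched below. The function name prepare_agw_dir and the mode 750 are assumptions for illustration; choose the owner and mode that match how Docker runs on your host.

```shell
# Sketch: create the log directory and hand it to the user that runs Docker.
# prepare_agw_dir is a hypothetical helper; mode 750 is an assumed default.
prepare_agw_dir() {
  dir="$1"
  owner="$2"
  mkdir -p "$dir" &&
    chown "$owner" "$dir" &&
    chmod 750 "$dir"
}

# Example: a directory under the current user's home, owned by that user.
# prepare_agw_dir "$HOME/gitlab-agw" "$(id -un)"
```

For a root-owned location such as /srv/gitlab-agw, run the same mkdir, chown, and chmod commands with sudo.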

Start a container from the image

For Docker images with version self-hosted-17.4.0-ee and later, run the following:

docker run -p 5052:5052 \
 -e AIGW_GITLAB_URL=<your_gitlab_instance> \
 -e AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/ \
 <image>

From the container host, accessing http://localhost:5052/docs should open the AI Gateway API documentation.
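
The same check can be scripted from the container host. This is a minimal sketch: the function name check_aigw and the variable AIGW_DOCS_URL are illustrative, and the default URL is the one given above.

```shell
# Sketch: report whether the AI Gateway docs page answers successfully.
# AIGW_DOCS_URL and check_aigw are illustrative names (assumptions).
AIGW_DOCS_URL="${AIGW_DOCS_URL:-http://localhost:5052/docs}"

check_aigw() {
  # --fail makes curl exit non-zero on HTTP errors; --max-time avoids hanging.
  if curl --silent --fail --max-time 5 --output /dev/null "$1"; then
    echo "AI Gateway reachable at $1"
  else
    echo "AI Gateway not reachable at $1"
    return 1
  fi
}

# Usage: check_aigw "$AIGW_DOCS_URL"
```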

Install by using the AI Gateway Helm chart

Prerequisites:

  • You must have a:
    • Domain you own, that you can add a DNS record to.
    • Kubernetes cluster.
    • Working installation of kubectl.
    • Working installation of Helm, version v3.11.0 or later.

For more information, see Test the GitLab chart on GKE or EKS.

Add the AI Gateway Helm repository

Add the AI Gateway Helm repository to Helm’s configuration:

helm repo add ai-gateway \
https://gitlab.com/api/v4/projects/gitlab-org%2fcharts%2fai-gateway-helm-chart/packages/helm/devel

Install the AI Gateway

  1. Create the ai-gateway namespace:

    kubectl create namespace ai-gateway
    
  2. Generate the certificate for the domain where you plan to expose the AI Gateway.
  3. Create the TLS secret in the previously created namespace:

    kubectl -n ai-gateway create secret tls ai-gateway-tls --cert="<path_to_cert>" --key="<path_to_cert_key>"
    
  4. For the AI Gateway to access the API, it must know where the GitLab instance is located. To do this, set the gitlab.url and gitlab.apiUrl together with the ingress.hosts and ingress.tls values as follows:

    helm repo add ai-gateway \
      https://gitlab.com/api/v4/projects/gitlab-org%2fcharts%2fai-gateway-helm-chart/packages/helm/devel
    helm repo update
    
    helm upgrade --install ai-gateway \
      ai-gateway/ai-gateway \
      --version 0.1.1 \
      --namespace=ai-gateway \
      --set="image.tag=<ai-gateway-image>" \
      --set="gitlab.url=https://<your_gitlab_domain>" \
      --set="gitlab.apiUrl=https://<your_gitlab_domain>/api/v4/" \
      --set "ingress.enabled=true" \
      --set "ingress.hosts[0].host=<your_gateway_domain>" \
      --set "ingress.hosts[0].paths[0].path=/" \
      --set "ingress.hosts[0].paths[0].pathType=ImplementationSpecific" \
      --set "ingress.tls[0].secretName=ai-gateway-tls" \
      --set "ingress.tls[0].hosts[0]=<your_gateway_domain>" \
      --set="ingress.className=nginx" \
      --timeout=300s --wait --wait-for-jobs
    

This step can take a few seconds while all resources are allocated and the AI Gateway starts.

Wait for your pods to get up and running:

kubectl wait pod \
  --all \
  --for=condition=Ready \
  --namespace=ai-gateway \
  --timeout=300s

When your pods are up and running, you can set up your IP ingresses and DNS records.
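
If you prefer to keep the chart settings in a file rather than in a long list of --set flags, the install command's values can be written out as sketched below. The keys mirror the --set flags from the install step. The function name write_aigw_values, the file name, and the example domains are assumptions for illustration.

```shell
# Sketch: write the chart settings from the install command to a values file.
# GITLAB_DOMAIN, GATEWAY_DOMAIN, and write_aigw_values are illustrative names.
GITLAB_DOMAIN="${GITLAB_DOMAIN:-gitlab.example.com}"
GATEWAY_DOMAIN="${GATEWAY_DOMAIN:-aigw.example.com}"

write_aigw_values() {
  cat > "$1" <<EOF
gitlab:
  url: https://${GITLAB_DOMAIN}
  apiUrl: https://${GITLAB_DOMAIN}/api/v4/
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: ${GATEWAY_DOMAIN}
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls:
    - secretName: ai-gateway-tls
      hosts:
        - ${GATEWAY_DOMAIN}
EOF
}

# Usage: write_aigw_values ai-gateway-values.yaml
```

You can then pass the file to the same helm upgrade --install command with -f ai-gateway-values.yaml in place of the gitlab.* and ingress.* --set flags.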

Configure the GitLab instance

Follow the steps in Configure your GitLab instance.

After you finish those steps, the Helm chart installation is complete.

Upgrade the AI Gateway Docker image

To upgrade the AI Gateway, download the newest Docker image tag.

  1. Stop the running container:

    sudo docker stop gitlab-aigw
    
  2. Remove the existing container:

    sudo docker rm gitlab-aigw
    
  3. Pull and run the new image.

  4. Ensure that the environment variables are all set correctly.
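
The upgrade steps can be collected into one function, sketched below. The container name gitlab-aigw and the two environment variables come from this page. The function name upgrade_aigw, the DOCKER override, and the --detach, --name, and -p 5052:5052 options are assumptions for illustration.

```shell
# Sketch: stop, remove, pull, and restart the AI Gateway container.
# Set DOCKER=echo for a dry run that only prints the commands.
DOCKER="${DOCKER:-sudo docker}"

upgrade_aigw() {
  image="$1"           # full image reference, including the new tag
  gitlab_domain="$2"   # for example gitlab.example.com
  $DOCKER stop gitlab-aigw &&
    $DOCKER rm gitlab-aigw &&
    $DOCKER pull "$image" &&
    $DOCKER run --detach --name gitlab-aigw -p 5052:5052 \
      -e AIGW_GITLAB_URL="https://$gitlab_domain" \
      -e AIGW_GITLAB_API_URL="https://$gitlab_domain/api/v4/" \
      "$image"
}

# Usage: upgrade_aigw "<image>" "<your_gitlab_domain>"
```

The commands are chained with &&, so the function stops at the first step that fails instead of starting a container in an unknown state.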

Alternative installation methods

For information on alternative ways to install the AI Gateway, see issue 463773.

Troubleshooting

First, run the debugging scripts to verify your self-hosted model setup.

For more information on other actions to take, see the troubleshooting documentation.