Troubleshooting GitLab Duo Self-Hosted Models

Tier: Ultimate with GitLab Duo Enterprise
Offering: GitLab Self-Managed
Status: Beta

History

Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.
Enabled on GitLab Self-Managed in GitLab 17.6.
Changed to require GitLab Duo add-on in GitLab 17.6 and later.
Feature flag ai_custom_model removed in GitLab 17.8.

When working with GitLab Duo Self-Hosted Models, you might encounter issues.

Before you begin troubleshooting, you should:

Be able to open the gitlab-rails console.
Be able to open a shell in the AI gateway Docker container.
Know the endpoint where your:
- AI gateway is hosted.
- Model is hosted.
Enable the feature flag expanded_ai_logging on the gitlab-rails console:
```
Feature.enable(:expanded_ai_logging)
```
Now, requests and responses from GitLab to the AI gateway are logged to llm.log.

Use debugging scripts

We provide two debugging scripts to help administrators verify their self-hosted model configuration.

Debug the GitLab to AI gateway connection. From your GitLab instance, run the Rake task:
```
gitlab-rake "gitlab:duo:verify_self_hosted_setup[<username>]"
```
Optional: Include a <username> that has an assigned seat. If you do not include a username parameter, the Rake task uses the root user.

Debug the AI gateway setup. For your AI gateway container, run:

```
docker exec -it <ai-gateway-container> sh
poetry run troubleshoot --model-name "mistral" --model-endpoint "http://localhost:4000"
```

Verify the output of the commands, and fix accordingly.

If both commands are successful, but GitLab Duo Code Suggestions is still not working, raise an issue on the issue tracker.

Check if GitLab can make a request to the model

From the GitLab Rails console, verify that GitLab can make a request to the model by running:

```
model_name = "<your_model_name>"
model_endpoint = "<your_model_endpoint>"
model_api_key = "<your_model_api_key>"
body = {:prompt_components=>[{:type=>"prompt", :metadata=>{:source=>"GitLab EE", :version=>"17.3.0"}, :payload=>{:content=>[{:role=>:user, :content=>"Hello"}], :provider=>:litellm, :model=>model_name, :model_endpoint=>model_endpoint, :model_api_key=>model_api_key}}]}
ai_gateway_url = Gitlab::AiGateway.url # Verify that it's not nil
client = Gitlab::Llm::AiGateway::Client.new(User.find_by_id(1), service_name: :self_hosted_models)
client.complete(url: "#{ai_gateway_url}/v1/chat/agent", body: body)
```

This should return a response from the model in the format:

```
{"response"=> "<Model response>",
 "metadata"=>
  {"provider"=>"litellm",
   "model"=>"<>",
   "timestamp"=>1723448920}}
```

If that is not the case, this might mean one of the following:

The user might not have access to Code Suggestions. To resolve, check if a user can request Code Suggestions.
The GitLab environment variables are not configured correctly. To resolve, check that the GitLab environmental variables are set up correctly.
The GitLab instance is not configured to use self-hosted models. To resolve, check if the GitLab instance is configured to use self-hosted models.
The AI gateway is not reachable. To resolve, check if GitLab can make an HTTP request to the AI gateway.
When the LLM server is installed on the same instance as the AI gateway container, local requests may not work. To resolve, allow local requests from the Docker container.
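
The expected response format can also be checked in code. The following is a minimal sketch, assuming the value you inspect is a plain Ruby Hash shaped like the example response. The helper name `model_response_ok?` is hypothetical and is not part of GitLab.

```ruby
# Hypothetical helper (not part of GitLab): checks that a Hash has the shape
# of the example response: a non-empty "response" string plus a "metadata"
# Hash that contains "provider", "model", and "timestamp".
def model_response_ok?(result)
  return false unless result.is_a?(Hash)
  return false unless result["response"].is_a?(String) && !result["response"].empty?

  metadata = result["metadata"]
  return false unless metadata.is_a?(Hash)

  %w[provider model timestamp].all? { |key| metadata.key?(key) }
end

# Sample data for illustration only.
sample = {
  "response" => "Hello! How can I help?",
  "metadata" => { "provider" => "litellm", "model" => "mistral", "timestamp" => 1723448920 }
}

puts model_response_ok?(sample)             # prints true
puts model_response_ok?({ "error" => "x" }) # prints false
```

If the helper returns false, work through the causes listed above in order.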

Check if a user can request Code Suggestions

In the GitLab Rails console, check if a user can request Code Suggestions by running:

```
User.find_by_id("<user_id>").can?(:access_code_suggestions)
```

If this returns false, it means some configuration is missing, and the user cannot access Code Suggestions.

This missing configuration might be because of either of the following:

The license is not valid. To resolve, check or update your license.
GitLab Duo was not configured to use a self-hosted model. To resolve, check if the GitLab instance is configured to use self-hosted models.

Check if the GitLab instance is configured to use self-hosted models

To check if GitLab Duo was configured correctly:

On the left sidebar, at the bottom, select Admin.
Select Settings > General.
Expand AI-powered features.
Under Features, check that Code Suggestions and Code generation are set to Self-hosted model.

Check that GitLab environmental variables are set up correctly

To check if the GitLab environmental variables are set up correctly, run the following on the GitLab Rails console:

```
ENV["AI_GATEWAY_URL"] == "<your-ai-gateway-endpoint>"
```

If the environmental variables are not set up correctly, set them by following the Linux package custom environment variables setting documentation.
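
A strict string comparison fails on harmless differences such as a trailing slash. The following sketch compares two endpoints after light normalization. The helper names `normalize_endpoint` and `same_endpoint?` are hypothetical, and ignoring a trailing slash is an assumption about what you consider equivalent.

```ruby
require "uri"

# Hypothetical helper: reduces a URL to the parts that matter for comparison.
# Returns nil when the value cannot be parsed as a URL.
def normalize_endpoint(value)
  uri = URI.parse(value.to_s.strip)
  [uri.scheme&.downcase, uri.host&.downcase, uri.port, uri.path.to_s.chomp("/")]
rescue URI::InvalidURIError
  nil
end

# Hypothetical helper: true when both values point at the same scheme, host,
# port, and path. A missing host (for example an unset variable) never matches.
def same_endpoint?(configured, expected)
  a = normalize_endpoint(configured)
  b = normalize_endpoint(expected)
  !a.nil? && !a[1].nil? && a == b
end
```

For example, on the GitLab Rails console you could run `same_endpoint?(ENV["AI_GATEWAY_URL"], "<your-ai-gateway-endpoint>")` with your own endpoint substituted.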

Check if GitLab can make an HTTP request to the AI gateway

In the GitLab Rails console, verify that GitLab can make an HTTP request to the AI gateway by running:

```
HTTParty.get('<your-aigateway-endpoint>/monitoring/healthz', headers: { 'accept' => 'application/json' }).code
```

If the response is not 200, this means either of the following:

The network is not properly configured to allow GitLab to reach the AI gateway container. Contact your network administrator to verify the setup.
The AI gateway is not able to process requests. To resolve this issue, check if the AI gateway can make a request to the model.
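
The same health check can be run from any machine with Ruby, using only the standard library. This is a sketch, not GitLab code: the helper names `healthz_uri`, `healthz_status`, and `explain_health` are hypothetical, and the five-second timeout is an arbitrary choice.

```ruby
require "net/http"
require "openssl"
require "uri"

# Builds the health-check URL from the AI gateway base URL.
def healthz_uri(base_url)
  URI.join(base_url.chomp("/") + "/", "monitoring/healthz")
end

# Returns the HTTP status code as an Integer, or nil when the AI gateway
# cannot be reached at all (connection refused, DNS failure, timeout, TLS error).
def healthz_status(base_url, timeout: 5)
  uri = healthz_uri(base_url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https",
                  open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(uri.request_uri, "accept" => "application/json").code.to_i
  end
rescue SystemCallError, SocketError, Timeout::Error, OpenSSL::SSL::SSLError
  nil
end

# Maps a status to the two causes described above.
def explain_health(status)
  case status
  when 200 then "AI gateway is reachable and healthy"
  when nil then "No response: check the network path between GitLab and the AI gateway"
  else "HTTP #{status}: the AI gateway answered but might not be able to process requests"
  end
end

# Runs only when the script is called with a base URL argument.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  puts explain_health(healthz_status(ARGV[0]))
end
```

Separating "no response" from "non-200 response" helps decide whether to contact your network administrator or to continue with the AI gateway checks.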

Check if the AI gateway can make a request to the model

From the AI gateway container, make an HTTP request to the AI gateway API for a Code Suggestion. Replace:

<your_model_name> with the name of the model you are using. For example, mistral or codegemma.
<your_model_endpoint> with the endpoint where the model is hosted.

```
docker exec -it <ai-gateway-container> sh
curl --request POST "http://localhost:5052/v1/chat/agent" \
     --header 'accept: application/json' \
     --header 'Content-Type: application/json' \
     --data '{ "prompt_components": [ { "type": "string", "metadata": { "source": "string", "version": "string" }, "payload": { "content": "Hello", "provider": "litellm", "model": "<your_model_name>", "model_endpoint": "<your_model_endpoint>" } } ], "stream": false }'
```

If the request fails, the:

AI gateway might not be configured properly to use self-hosted models. To resolve this, check that the AI gateway environmental variables are set up correctly.
AI gateway might not be able to access the model. To resolve, check if the model is reachable from the AI gateway.
Model name or endpoint might be incorrect. Check the values, and correct them if necessary.

Check if the AI gateway can process requests

From a shell in the AI gateway container, call the health endpoint:

```
docker exec -it <ai-gateway-container> sh
curl '<your-aigateway-endpoint>/monitoring/healthz'
```

If the response is not 200, this means that the AI gateway is not installed correctly. To resolve, follow the documentation on how to install the AI gateway.

Check that the AI gateway environmental variables are set up correctly

To check that the AI gateway environmental variables are set up correctly, run the following in a console on the AI gateway container:

```
docker exec -it <ai-gateway-container> sh
echo $AIGW_AUTH__BYPASS_EXTERNAL # must be true
echo $AIGW_CUSTOM_MODELS__ENABLED # must be true
```

If the environmental variables are not set up correctly, set them by creating a container.
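
If you prefer to check both variables in one pass from the host, you can capture the output of `docker exec <ai-gateway-container> env` and inspect it. The following sketch assumes that output is available as a string of `KEY=VALUE` lines. The helper names `parse_env` and `misconfigured` are hypothetical.

```ruby
# Variables that this page says must be set to true.
REQUIRED_TRUE = %w[AIGW_AUTH__BYPASS_EXTERNAL AIGW_CUSTOM_MODELS__ENABLED].freeze

# Parses KEY=VALUE lines (as printed by `env`) into a Hash.
# Lines without an "=" are skipped.
def parse_env(text)
  text.each_line.filter_map do |line|
    key, value = line.chomp.split("=", 2)
    [key, value] if key && value
  end.to_h
end

# Returns the names of required variables that are missing or not exactly "true".
def misconfigured(env)
  REQUIRED_TRUE.reject { |name| env[name] == "true" }
end

# Sample output for illustration only.
sample = "AIGW_AUTH__BYPASS_EXTERNAL=true\nAIGW_CUSTOM_MODELS__ENABLED=false\nPATH=/usr/bin\n"
puts misconfigured(parse_env(sample)).inspect # prints ["AIGW_CUSTOM_MODELS__ENABLED"]
```

An empty list means both variables are set as required.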

Check if the model is reachable from the AI gateway

Create a shell on the AI gateway container and make a curl request to the model. If you find that the AI gateway cannot make that request, this might be caused by the:

Model server not functioning correctly.
Network settings around the container not being properly configured to allow requests to where the model is hosted.

To resolve this, contact your network administrator.

The image’s platform does not match the host

When finding the AI gateway release, you might get an error that states The requested image’s platform (linux/amd64) does not match the detected host.

To work around this error, add --platform linux/amd64 to the docker run command:

```
docker run --platform linux/amd64 -e AIGW_GITLAB_URL=<your-gitlab-endpoint> <image>
```

LLM server is not available inside the AI gateway container

If the LLM server is installed on the same instance as the AI gateway container, it may not be accessible through the local host.

To resolve this:

Include --network host in the docker run command to enable local requests from the AI gateway container.
Use the -e AIGW_FASTAPI__METRICS_PORT=8083 flag to address the port conflicts.

```
docker run --network host -e AIGW_GITLAB_URL=<your-gitlab-endpoint> -e AIGW_FASTAPI__METRICS_PORT=8083 <image>
```

vLLM 404 Error

If you encounter a 404 error while using vLLM, follow these steps to resolve the issue:

Create a chat template file named chat_template.jinja with the following content:

```
{%- for message in messages %}
  {%- if message["role"] == "user" %}
    {{- "[INST] " + message["content"] + "[/INST]" }}
  {%- elif message["role"] == "assistant" %}
    {{- message["content"] }}
  {%- elif message["role"] == "system" %}
    {{- bos_token }}{{- message["content"] }}
  {%- endif %}
{%- endfor %}
```

When running the vLLM command, ensure you specify the --served-model-name. For example:

```
vllm serve "mistralai/Mistral-7B-Instruct-v0.3" --port <port> --max-model-len 17776 --served-model-name mistral --chat-template chat_template.jinja
```

Check the vLLM server URL in the GitLab UI to make sure that the URL includes the /v1 suffix. The correct format is:
```
http(s)://<your-host>:<your-port>/v1
```
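
The URL format above can be checked before you save it. This is a minimal sketch with a hypothetical helper name, `vllm_url_ok?`. It is deliberately strict and accepts only URLs whose path ends in exactly `/v1`, so a trailing slash is rejected.

```ruby
require "uri"

# Hypothetical helper: true when the URL uses http or https, has a host,
# and its path ends with the /v1 suffix.
def vllm_url_ok?(url)
  uri = URI.parse(url.to_s.strip)
  %w[http https].include?(uri.scheme) && !uri.host.nil? && uri.path.end_with?("/v1")
rescue URI::InvalidURIError
  false
end

# Example hosts are placeholders.
puts vllm_url_ok?("http://vllm.example.com:8000/v1") # prints true
puts vllm_url_ok?("http://vllm.example.com:8000")    # prints false
```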

Code Suggestions access error

If you are experiencing issues accessing Code Suggestions after setup, try the following steps:

In the Rails console, check and verify the license parameters:

```
sudo gitlab-rails console
user = User.find(id) # Replace id with the user provisioned with GitLab Duo Enterprise seat
Ability.allowed?(user, :access_code_suggestions) # Must return true
```

Check if the necessary features are enabled and available:

```
::Ai::FeatureSetting.code_suggestions_self_hosted? # Should be true
```

Verify GitLab setup

To verify your GitLab Self-Managed setup, run the following command:

```
gitlab-rake gitlab:duo:verify_self_hosted_setup
```

No logs generated in the AI gateway server

If no logs are generated in the AI gateway server, follow these steps to troubleshoot:

Ensure the expanded_ai_logging feature flag is enabled:
```
Feature.enable(:expanded_ai_logging)
```
Run the following commands to view the GitLab Rails logs for any errors:
```
sudo gitlab-ctl tail
sudo gitlab-ctl tail sidekiq
```
Look for keywords like “Error” or “Exception” in the logs to identify any underlying issues.
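
When the log output is long, a small filter can narrow it down. The following sketch assumes you have saved the log output to a file or have it as a string. The helper name `suspicious_lines` is hypothetical, and the case-insensitive match is a choice made here, not a GitLab convention.

```ruby
# Keywords suggested above, matched without regard to case.
KEYWORDS = /error|exception/i

# Returns matching lines prefixed with their 1-based line number.
# Accepts a String or any IO object (for example File.open("saved.log")).
def suspicious_lines(source)
  source.each_line.with_index(1)
        .select { |line, _number| line.match?(KEYWORDS) }
        .map { |line, number| "#{number}: #{line.chomp}" }
end

# Sample log text for illustration only.
sample = "request ok\nNoMethodError: boom\nrequest ok\nUnhandled exception in worker\n"
puts suspicious_lines(sample)
```

Matching on substrings also catches class names such as NoMethodError, at the cost of some false positives.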

SSL certificate errors and key deserialization issues in the AI gateway container

When attempting to initiate a Duo Chat inside the AI gateway container, SSL certificate errors and key deserialization issues may occur.

The system might encounter issues loading the PEM file, resulting in errors like:

```
JWKError: Could not deserialize key data. The data may be in an incorrect format, the provided password may be incorrect, or it may be encrypted with an unsupported algorithm.
```

To resolve the SSL certificate error:

Set the appropriate certificate bundle path in the Docker container using the following environment variables:
- SSL_CERT_FILE=/path/to/ca-bundle.pem
- REQUESTS_CA_BUNDLE=/path/to/ca-bundle.pem

Troubleshooting common Duo Chat errors

Error A1000

You might get an error that states I'm sorry, I couldn't respond in time. Please try again. Error code: A1000.

This error occurs when there is a timeout during processing. Try your request again.

Error A1001

You might get an error that states I'm sorry, I can't generate a response. Please try again. Error code: A1001.

This error means there was a problem connecting to the AI gateway. You might need to check the network settings and ensure that the AI gateway is accessible from the GitLab instance.

Use the self-hosted debugging script to verify if the AI gateway is accessible from the GitLab instance and is working as expected.

If the problem persists, report the issue to the GitLab support team.

Error A1002

You might get an error that states I'm sorry, I couldn't respond in time. Please try again. Error code: A1002.

This error occurs when no events are returned from the AI gateway, or when GitLab fails to parse the events. Check the AI gateway logs for any errors.

Error A1003

You might get an error that states I'm sorry, I couldn't respond in time. Please try again. Error code: A1003.

This error typically occurs due to issues with streaming from the model to the AI gateway. To resolve this issue:

In the AI gateway container, run the following command:

```
curl --request 'POST' \
'http://localhost:5052/v2/chat/agent' \
--header 'accept: application/json' \
--header 'Content-Type: application/json' \
--header 'x-gitlab-enabled-feature-flags: expanded_ai_logging' \
--data '{
  "messages": [
    {
      "role": "user",
      "content": "Hello",
      "context": null,
      "current_file": null,
      "additional_context": []
    }
  ],
  "model_metadata": {
    "provider": "custom_openai",
    "name": "mistral",
    "endpoint": "<change here>",
    "api_key": "<change here>",
    "identifier": "<change here>"
  },
  "unavailable_resources": [],
  "options": {
    "agent_scratchpad": {
      "agent_type": "react",
      "steps": []
    }
  }
}'
```

If streaming is working, chunked responses should be displayed. If it is not, it will likely show an empty response.

Check the AI gateway logs for specific error messages, because this is usually a model deployment issue.
To validate the connection, disable the streaming by setting the AIGW_CUSTOM_MODELS__DISABLE_STREAMING environment variable in your AI gateway container:
```
docker run .... -e AIGW_CUSTOM_MODELS__DISABLE_STREAMING=true ...
```

Error A9999

You might get an error that states I'm sorry, I can't generate a response. Please try again. Error code: A9999.

This error occurs when an unknown error occurs in the ReAct agent. Try your request again. If the problem persists, report the issue to the GitLab support team.
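
The error codes in this section can be summarized as a lookup table, for example to annotate messages collected from users. This is an illustrative sketch: the table `DUO_CHAT_ERRORS` and the helper `advice_for` are hypothetical, and the causes and actions are condensed from the descriptions above.

```ruby
# Causes and next steps condensed from this section.
DUO_CHAT_ERRORS = {
  "A1000" => { cause: "Timeout during processing",
               action: "Try the request again" },
  "A1001" => { cause: "Problem connecting to the AI gateway",
               action: "Check network settings and run the self-hosted debugging script" },
  "A1002" => { cause: "No events returned from the AI gateway, or GitLab failed to parse them",
               action: "Check the AI gateway logs" },
  "A1003" => { cause: "Problem streaming from the model to the AI gateway",
               action: "Test the /v2/chat/agent endpoint and check the AI gateway logs" },
  "A9999" => { cause: "Unknown error in the ReAct agent",
               action: "Try again, then report the issue to the GitLab support team" }
}.freeze

# Extracts the error code from a Duo Chat message and returns a one-line hint.
def advice_for(message)
  code = message[/Error code: (A\d{4})/, 1]
  entry = DUO_CHAT_ERRORS[code]
  return "No known Duo Chat error code found" unless entry

  "#{code}: #{entry[:cause]}. #{entry[:action]}."
end

puts advice_for("I'm sorry, I couldn't respond in time. Please try again. Error code: A1003.")
```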
