Evaluation runner

Evaluation runner (evaluation-runner) allows GitLab employees to run evaluations on specific GitLab AI features with one click.

You can run evaluations against GitLab.com and against GitLab-supported self-hosted models.
To view the AI features that are currently supported, see Evaluation pipelines.

Evaluation runner spins up a new GDK (GitLab Development Kit) instance in a remote environment, runs an evaluation against it, and reports the result.

For more details, view the evaluation-runner repository.

Architecture

flowchart LR
  subgraph EV["Evaluators"]
    PL(["PromptLibrary/ELI5"])
    DSIN(["Input Dataset"])
  end

  subgraph ER["EvaluationRunner"]
    CI["CI/CD pipelines"]
    subgraph GDKS["Remote GDKs"]
        subgraph GDKM["GDK-master"]
          bl1["Duo features on master branch"]
          fi1["fixtures (Issue,MR,etc)"]
        end
        subgraph GDKF["GDK-feature"]
          bl2["Duo features on feature branch"]
          fi2["fixtures (Issue,MR,etc)"]
        end
    end
  end

  subgraph MR["MergeRequests"]
    GRMR["GitLab-Rails MR"]
    GRAI["AI Gateway MR"]
  end

  MR -- [1] trigger --- CI
  CI -- [2] spins up --- GDKS
  PL -- [3] get responses and evaluate --- GDKS
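
The three numbered edges in the diagram can be read as a short sequence: a merge request triggers a pipeline, the pipeline spins up remote GDKs, and the evaluator sends its input dataset to those GDKs and collects the responses. The following Python sketch only models that sequence. It is not the evaluation-runner implementation: every class and function name here (`RemoteGDK`, `trigger_pipeline`, `evaluate`) is hypothetical, the `respond` method is a stand-in for a real Duo feature call, and it assumes that one run spins up both a master GDK and a feature-branch GDK.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteGDK:
    """Hypothetical stand-in for one remote GDK instance in the diagram."""
    name: str    # "GDK-master" or "GDK-feature", as labeled in the diagram
    branch: str  # branch whose Duo features this instance serves
    fixtures: list[str] = field(default_factory=lambda: ["Issue", "MR"])

    def respond(self, prompt: str) -> str:
        # Placeholder: a real instance would call a Duo feature here.
        return f"[{self.branch}] response to: {prompt}"


def trigger_pipeline(feature_branch: str) -> list[RemoteGDK]:
    """Steps [1] and [2]: an MR triggers CI, which spins up the remote GDKs.

    Assumption: one GDK on master and one on the feature branch.
    """
    return [
        RemoteGDK("GDK-master", "master"),
        RemoteGDK("GDK-feature", feature_branch),
    ]


def evaluate(gdks: list[RemoteGDK], dataset: list[str]) -> dict[str, list[str]]:
    """Step [3]: the evaluator sends each dataset input to each GDK."""
    return {gdk.name: [gdk.respond(prompt) for prompt in dataset] for gdk in gdks}


if __name__ == "__main__":
    gdks = trigger_pipeline("my-feature")
    for name, responses in evaluate(gdks, ["Summarize this issue"]).items():
        print(name, responses)
```

Keeping the responses keyed by GDK name makes it straightforward to compare the feature branch with master for the same input, which is the comparison the two-GDK layout in the diagram suggests.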
