Evaluation runner

Evaluation runner (evaluation-runner) allows GitLab employees to run evaluations on specific GitLab AI features with one click.

  • You can run the evaluation on GitLab.com and GitLab-supported self-hosted models.
  • To view the AI features that are currently supported, see Evaluation pipelines.

Evaluation runner spins up a new GDK instance in a remote environment, runs an evaluation against it, and reports the result.

For more details, view the evaluation-runner repository.

Architecture

Diagram: evaluation runner architecture. Labels recovered from the diagram:

  • EvaluationRunner
  • Evaluators
  • MergeRequests: GitLab-Rails MR, AI Gateway MR
  • CI/CD pipelines
  • Remote GDKs:
      GDK-master: Duo features on master branch, fixtures (Issue, MR, etc.)
      GDK-feature: Duo features on feature branch, fixtures (Issue, MR, etc.)
  • PromptLibrary/ELI5
  • Input Dataset
  • Edge labels: [1] trigger, [2] spins up, [3] get responses and evaluate
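
The three labeled steps (trigger, spin up, get responses and evaluate) can be illustrated with a small Python sketch. This is not the evaluation-runner implementation. Every name in it (`RemoteGdk`, `spin_up`, `evaluate`, `run_pipeline`, the dataset fields, the `my-branch` branch name) is hypothetical, and it assumes, based only on the GDK-master and GDK-feature labels, that the same input dataset is scored against both instances so the results can be compared.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RemoteGdk:
    """Hypothetical stand-in for a remote GDK instance seeded with fixtures."""
    name: str                     # "GDK-master" or "GDK-feature"
    branch: str                   # branch the instance is built from
    answer: Callable[[str], str]  # stand-in for calling a Duo feature


def spin_up(name: str, branch: str, answer: Callable[[str], str]) -> RemoteGdk:
    # Step [2]: a real runner would provision a remote environment here.
    return RemoteGdk(name=name, branch=branch, answer=answer)


def evaluate(gdk: RemoteGdk, dataset: List[Dict[str, str]],
             score: Callable[[str, str], float]) -> float:
    # Step [3]: get a response for every dataset row and return the mean score.
    scores = [score(row["expected"], gdk.answer(row["input"])) for row in dataset]
    return sum(scores) / len(scores)


def run_pipeline(feature_branch: str, dataset: List[Dict[str, str]],
                 answers: Dict[str, Callable[[str], str]],
                 score: Callable[[str, str], float]) -> Dict[str, float]:
    # Step [1]: assumed to be triggered from a merge request.
    gdks = [
        spin_up("GDK-master", "master", answers["master"]),
        spin_up("GDK-feature", feature_branch, answers[feature_branch]),
    ]
    # Report one score per instance so the two branches can be compared.
    return {gdk.name: evaluate(gdk, dataset, score) for gdk in gdks}


# Toy data: the "master" stub always answers "4"; the feature stub is correct.
dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "3+3", "expected": "6"},
]
answers = {
    "master": lambda question: "4",
    "my-branch": lambda question: {"2+2": "4", "3+3": "6"}[question],
}


def exact_match(expected: str, actual: str) -> float:
    return 1.0 if expected == actual else 0.0


report = run_pipeline("my-branch", dataset, answers, exact_match)
print(report)  # {'GDK-master': 0.5, 'GDK-feature': 1.0}
```

In this toy run the feature branch scores higher than master. In practice the evaluators and datasets come from the tooling named in the diagram, and the scoring is likely far richer than exact match.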