MLflow client compatibility

Tier: Free, Premium, Ultimate
Offering: GitLab.com, GitLab Self-Managed, GitLab Dedicated

MLflow is a popular open source tool for Machine Learning experiment tracking. GitLab Model experiment tracking and GitLab Model registry are compatible with the MLflow client. The setup requires minimal changes to existing code.

Enable MLflow client integration

Prerequisites:

- A personal, project, or group access token with at least the Developer role and the api scope.
- The project ID. To find the project ID:
  1. On the left sidebar, select Search or go to and find your project.
  2. Select Settings > General.

To use MLflow client compatibility from a local environment:

1. Set the tracking URI and token environment variables on the host that runs the code. This can be your local environment, a CI pipeline, or a remote host. For example:
```shell
export MLFLOW_TRACKING_URI="<your gitlab endpoint>/api/v4/projects/<your project id>/ml/mlflow"
export MLFLOW_TRACKING_TOKEN="<your_access_token>"
```
2. If the training code contains a call to mlflow.set_tracking_uri(), remove it.

In the model registry, you can copy the tracking URI from the overflow menu in the top right by selecting the vertical ellipsis (⋮).
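
The tracking URI follows a fixed pattern, so it can also be assembled in Python before the first MLflow call. The following is a minimal sketch, not part of the GitLab or MLflow API: `build_tracking_uri` and `configure_environment` are hypothetical helper names, and the endpoint `https://gitlab.example.com`, the project ID `42`, and the token value are placeholders.

```python
import os


def build_tracking_uri(gitlab_endpoint, project_id):
    """Return the MLflow tracking URI for a GitLab project (hypothetical helper)."""
    # Drop a trailing slash so the result does not contain "//api".
    base = gitlab_endpoint.rstrip("/")
    return f"{base}/api/v4/projects/{project_id}/ml/mlflow"


def configure_environment(gitlab_endpoint, project_id, token, environ=os.environ):
    """Set the two environment variables described in the steps above."""
    environ["MLFLOW_TRACKING_URI"] = build_tracking_uri(gitlab_endpoint, project_id)
    environ["MLFLOW_TRACKING_TOKEN"] = token


# Placeholder values for illustration only; a plain dict stands in for os.environ.
example_env = {}
configure_environment("https://gitlab.example.com/", 42, "<your_access_token>", environ=example_env)
print(example_env["MLFLOW_TRACKING_URI"])
```

Passing a dictionary instead of `os.environ` keeps the sketch free of side effects. In real training code you would likely call `configure_environment` with the default `environ` before any MLflow call.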

Model experiments

When you run the training code, you can use the MLflow client to create experiments, runs, models, and model versions, and to log parameters, metrics, metadata, and artifacts on GitLab.

After experiments are logged, they are listed under /<your project>/-/ml/experiments.

Runs are registered and can be explored by selecting an experiment, model, or model version.

Creating an experiment

```python
import mlflow

# Create a new experiment
experiment_id = mlflow.create_experiment(name="<your_experiment>")

# Setting the active experiment also creates a new experiment if it doesn't exist.
mlflow.set_experiment(experiment_name="<your_experiment>")
```

Creating a run

```python
import mlflow

# Creating a run requires an experiment ID or an active experiment
mlflow.set_experiment(experiment_name="<your_experiment>")

# Runs can be created with or without a context manager

# With a context manager, the run ends when the block exits
with mlflow.start_run() as run:
    print(run.info.run_id)
    # Your training code

# Without a context manager, end the run explicitly
run = mlflow.start_run()
print(run.info.run_id)
# Your training code
mlflow.end_run()
```

Logging parameters and metrics

```python
import mlflow

mlflow.set_experiment(experiment_name="<your_experiment>")

with mlflow.start_run():
    # Parameter keys need to be unique in the scope of the run
    mlflow.log_param(key="param_1", value=1)

    # Metrics can be updated throughout the run
    mlflow.log_metric(key="metrics_1", value=1)
    mlflow.log_metric(key="metrics_1", value=2)
```

Logging artifacts

```python
import mlflow

mlflow.set_experiment(experiment_name="<your_experiment>")

with mlflow.start_run():
    # Plain text files can be logged as artifacts using `log_text`
    mlflow.log_text('Hello, World!', artifact_file='hello.txt')

    # Existing local files can be logged using `log_artifact`
    mlflow.log_artifact(
        local_path='<local/path/to/file.txt>',
        artifact_path='<optional relative path to log the artifact at>'
    )
```

Logging models

Models can be logged using one of the supported MLflow Model flavors. Logging with a model flavor records the metadata, making it easier to manage, load, and deploy models across different tools and environments.

```python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

mlflow.set_experiment(experiment_name="<your_experiment>")

with mlflow.start_run():
    # Create and train a simple model
    # X_train and y_train are placeholders for your training features and labels
    model = RandomForestClassifier(n_estimators=10, random_state=42)
    model.fit(X_train, y_train)

    # Log the model using the MLflow sklearn model flavor
    mlflow.sklearn.log_model(model, artifact_path="")
```

Loading a run

You can load a run from the GitLab model registry to, for example, make predictions.

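```python
import os

import mlflow
import mlflow.pyfunc

run_id = "<your_run_id>"
download_path = "models"  # Local folder to download to

# Make sure the download folder exists before loading
os.makedirs(download_path, exist_ok=True)

# Load the model logged by the run and keep a reference to it
model = mlflow.pyfunc.load_model(f"runs:/{run_id}/", dst_path=download_path)

sample_input = [[1, 0, 3, 4], [2, 0, 1, 2]]
model.predict(data=sample_input)
```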

Associating a run to a CI/CD job

If your training code runs in a CI/CD job, GitLab can use that information to enhance run metadata. To associate a run with a CI/CD job:

In the Project CI variables, include the following variables:
- MLFLOW_TRACKING_URI: "<your gitlab endpoint>/api/v4/projects/<your project id>/ml/mlflow"
- MLFLOW_TRACKING_TOKEN: <your_access_token>

In your training code within the run execution context, add the following code snippet:

```python
import os
import mlflow

index = 1  # Placeholder: any value that identifies this run

with mlflow.start_run(run_name=f"Run {index}"):
    # Your training code

    # Start of snippet to be included
    if os.getenv('GITLAB_CI'):
        mlflow.set_tag('gitlab.CI_JOB_ID', os.getenv('CI_JOB_ID'))
    # End of snippet to be included
```
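
If several scripts need the same check, the environment lookup can be factored out and exercised without a real pipeline. This is a minimal sketch under stated assumptions: `ci_job_tags` is a hypothetical helper name, and the dictionaries passed to it simulate the variables that a GitLab CI/CD job would provide.

```python
import os


def ci_job_tags(environ=os.environ):
    """Return run tags describing the current GitLab CI/CD job, or {} outside CI."""
    # Without GITLAB_CI there is no job to associate the run with.
    if not environ.get("GITLAB_CI"):
        return {}
    job_id = environ.get("CI_JOB_ID")
    return {"gitlab.CI_JOB_ID": job_id} if job_id else {}


# Simulated environments for illustration; a real job provides these variables.
print(ci_job_tags({"GITLAB_CI": "true", "CI_JOB_ID": "123"}))
print(ci_job_tags({}))
```

Inside a run, a non-empty result could be passed to `mlflow.set_tags()`, which appears in the list of supported client methods.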

Model registry

You can also manage models and model versions by using the MLflow client. Models are registered under /<your project>/-/ml/models.

Models

Creating a model

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
description = 'Model description'
model = client.create_registered_model(model_name, description=description)
```

Notes

- The create_registered_model argument tags is ignored.
- name must be unique within the project.
- name cannot be the name of an existing experiment.

Fetching a model

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
model = client.get_registered_model(model_name)
```

Updating a model

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
description = 'New description'
client.update_registered_model(model_name, description=description)
```

Deleting a model

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
client.delete_registered_model(model_name)
```

Logging runs to a model

Every model has an associated experiment with the same name prefixed by [model]. To log a run to the model, look up that experiment by its prefixed name and create the run in it:

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
exp = client.get_experiment_by_name(f"[model]{model_name}")
run = client.create_run(exp.experiment_id)
```

Model version

Creating a model version

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
description = 'Model version description'
model_version = client.create_model_version(model_name, source="", description=description)
```

If no version is passed, it is auto-incremented from the latest uploaded version. To set the version yourself, pass the gitlab.version tag during model version creation. The version must follow SemVer format.

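```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'  # Must follow SemVer, for example '1.0.0'
description = 'Model version description'
tags = { "gitlab.version": version }
# The version is set through the gitlab.version tag; source is ignored by GitLab
client.create_model_version(model_name, source="", description=description, tags=tags)
```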
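
Because the version must follow SemVer, it can help to check the string locally before creating the model version. The sketch below uses the regular expression suggested on semver.org for Semantic Versioning 2.0.0. `is_semver` is a hypothetical helper name, and the exact validation that GitLab applies is not stated here, so treat this only as a pre-check.

```python
import re

# Regular expression suggested on semver.org for Semantic Versioning 2.0.0.
SEMVER_PATTERN = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def is_semver(version):
    """Return True if the string is a valid Semantic Versioning 2.0.0 version."""
    return SEMVER_PATTERN.fullmatch(version) is not None


for candidate in ["1.0.0", "1.0.0-rc.1", "1.0", "v1.0.0", "01.0.0"]:
    print(candidate, is_semver(candidate))
```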

Notes

- Argument run_id is ignored. Every model version behaves as a run. Creating a model version from a run is not yet supported.
- Argument source is ignored. GitLab will create a package location for the model version files.
- Argument run_link is ignored.
- Argument await_creation_for is ignored.

Updating a model version

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
description = 'New description'
client.update_model_version(model_name, version, description=description)
```

Fetching a model version

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
client.get_model_version(model_name, version)
```

Getting latest versions of a model

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
client.get_latest_versions(model_name)
```

Notes

- Argument stages is ignored.
- Versions are ordered by highest semantic version.
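
Ordering by semantic version differs from ordering version strings as text. The sketch below illustrates the difference for plain MAJOR.MINOR.PATCH versions. `semver_key` is a hypothetical helper that ignores pre-release and build metadata, so it is a simplification and not a description of how GitLab orders versions internally.

```python
def semver_key(version):
    """Sort key for MAJOR.MINOR.PATCH strings; pre-release and build parts are ignored."""
    core = version.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch)


versions = ["1.2.0", "1.10.0", "1.9.3"]

# Highest semantic version first: 1.10.0 ranks above 1.9.3.
by_semver = sorted(versions, key=semver_key, reverse=True)

# Plain text ordering compares character by character and gets this wrong.
by_text = sorted(versions, reverse=True)

print(by_semver)
print(by_text)
```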

Loading a model version

```python
import mlflow
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'  # e.g. '1.0.0'

# Alternatively search the version
version = mlflow.search_registered_models(filter_string=f"name='{model_name}'")[0].latest_versions[0].version

model = mlflow.pyfunc.load_model(f"models:/{model_name}/{version}")

# Or load the latest version
model = mlflow.pyfunc.load_model(f"models:/{model_name}/latest")
```

Logging metrics and parameters to a model version

Every model version is also a run, so you can log parameters and metrics to it. You can find the run ID either on the model version page in GitLab or by using the MLflow client:

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
model_version = client.get_model_version(model_name, version)
run_id = model_version.run_id

# Your training code

client.log_metric(run_id, '<metric_name>', '<metric_value>')
client.log_param(run_id, '<param_name>', '<param_value>')
# metric_list, param_list, and tag_list are placeholders for lists of
# mlflow.entities Metric, Param, and RunTag objects built by your code
client.log_batch(run_id, metric_list, param_list, tag_list)
```

Because each file has a size limit of 5 GB, you must partition larger models.
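
How to partition a large model is left to you. One possible approach, sketched below, splits a file into numbered parts that each stay at or below a chosen byte limit, so every part can be uploaded as its own file. `split_file` and the `.partNNNN` naming are hypothetical, and the parts must be concatenated in order before the model can be loaded. The exact byte count behind the 5 GB limit is not stated here, so pick a limit comfortably below it.

```python
import tempfile
from pathlib import Path


def split_file(source, out_dir, max_part_bytes, buffer_bytes=1024 * 1024):
    """Split source into numbered parts of at most max_part_bytes each."""
    if max_part_bytes <= 0 or buffer_bytes <= 0:
        raise ValueError("max_part_bytes and buffer_bytes must be positive")
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with source.open("rb") as src:
        index = 0
        while True:
            remaining = max_part_bytes
            # Read in small buffers so a multi-gigabyte part never sits in memory.
            chunk = src.read(min(buffer_bytes, remaining))
            if not chunk:
                break  # End of the source file
            part_path = out_dir / f"{source.name}.part{index:04d}"
            with part_path.open("wb") as dst:
                while chunk:
                    dst.write(chunk)
                    remaining -= len(chunk)
                    if remaining == 0:
                        break  # This part is full; start the next one
                    chunk = src.read(min(buffer_bytes, remaining))
            parts.append(part_path)
            index += 1
    return parts


# Small demonstration: a 10-byte file split with a 4-byte limit.
with tempfile.TemporaryDirectory() as tmp:
    demo_source = Path(tmp) / "model.bin"
    demo_source.write_bytes(bytes(range(10)))
    demo_parts = split_file(demo_source, Path(tmp) / "parts", max_part_bytes=4, buffer_bytes=3)
    part_sizes = [part.stat().st_size for part in demo_parts]
    rejoined = b"".join(part.read_bytes() for part in demo_parts)

print(part_sizes)
```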

Logging artifacts to a model version

GitLab creates a package that can be used by the MLflow client to upload files.

```python
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
model_version = client.get_model_version(model_name, version)
run_id = model_version.run_id

# Your training code

# figure, my_dict, and image are placeholders for objects created by your code
client.log_artifact(run_id, '<local/path/to/file.txt>', artifact_path="")
client.log_figure(run_id, figure, artifact_file="my_plot.png")
client.log_dict(run_id, my_dict, artifact_file="my_dict.json")
client.log_image(run_id, image, artifact_file="image.png")
```

Artifacts are then available under /<your project>/-/ml/models/<model_id>/versions/<version_id>.

Linking a model version to a CI/CD job

Similar to runs, it is also possible to link a model version to a CI/CD job:

```python
import os
from mlflow import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
model_version = client.get_model_version(model_name, version)
run_id = model_version.run_id

# Your training code

if os.getenv('GITLAB_CI'):
    client.set_tag(model_version.run_id, 'gitlab.CI_JOB_ID', os.getenv('CI_JOB_ID'))
```

Supported MLflow client methods and caveats

GitLab supports the following methods from the MLflow client. More information can be found in the MLflow Documentation. The MlflowClient counterparts of the methods below are also supported with the same caveats.

| Method | Supported | Version added | Comments |
|---|---|---|---|
| `create_experiment` | Yes | 15.11 | |
| `get_experiment` | Yes | 15.11 | |
| `get_experiment_by_name` | Yes | 15.11 | |
| `delete_experiment` | Yes | 17.5 | |
| `set_experiment` | Yes | 15.11 | |
| `get_run` | Yes | 15.11 | |
| `delete_run` | Yes | 17.5 | |
| `start_run` | Yes | 15.11 | (16.3) If a name is not provided, the run receives a random nickname. |
| `search_runs` | Yes | 15.11 | (16.4) `experiment_ids` supports only a single experiment ID with order by column or metric. |
| `log_artifact` | Yes with caveat | 15.11 | (15.11) `artifact_path` must be empty. Does not support directories. |
| `log_artifacts` | Yes with caveat | 15.11 | (15.11) `artifact_path` must be empty. Does not support directories. |
| `log_batch` | Yes | 15.11 | |
| `log_metric` | Yes | 15.11 | |
| `log_metrics` | Yes | 15.11 | |
| `log_param` | Yes | 15.11 | |
| `log_params` | Yes | 15.11 | |
| `log_figure` | Yes | 15.11 | |
| `log_image` | Yes | 15.11 | |
| `log_text` | Yes with caveat | 15.11 | (15.11) Does not support directories. |
| `log_dict` | Yes with caveat | 15.11 | (15.11) Does not support directories. |
| `set_tag` | Yes | 15.11 | |
| `set_tags` | Yes | 15.11 | |
| `set_terminated` | Yes | 15.11 | |
| `end_run` | Yes | 15.11 | |
| `update_run` | Yes | 15.11 | |
| `log_model` | Partial | 15.11 | (15.11) Saves the artifacts, but not the model data. `artifact_path` must be empty. |
| `load_model` | Yes | 17.5 | |
| `download_artifacts` | Yes | 17.9 | |
| `list_artifacts` | Yes | 17.9 | |

Other MLflowClient methods:

| Method | Supported | Version added | Comments |
|---|---|---|---|
| `create_registered_model` | Yes with caveats | 16.8 | See notes |
| `get_registered_model` | Yes | 16.8 | |
| `delete_registered_model` | Yes | 16.8 | |
| `update_registered_model` | Yes | 16.8 | |
| `create_model_version` | Yes with caveats | 16.8 | See notes |
| `get_model_version` | Yes | 16.8 | |
| `get_latest_versions` | Yes with caveats | 16.8 | See notes |
| `update_model_version` | Yes | 16.8 | |

Known issues

- The API GitLab supports is the one defined at MLflow version 2.7.1.
- MLflow client methods not listed in supported methods might still work but have not been tested.
- During creation of experiments and runs, ExperimentTags are stored, even though they are not displayed.
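
Because the supported API is the one defined at MLflow 2.7.1, it can be useful to check which client version is installed before debugging unexpected behavior. The sketch below uses only the standard library. The helper names are hypothetical, and a newer client is not necessarily incompatible, so the check is informational only.

```python
from importlib import metadata

GITLAB_MLFLOW_API_VERSION = "2.7.1"  # Version named in the known issues list


def parse_release(version):
    """Return the leading MAJOR.MINOR.PATCH numbers of a version string as ints."""
    numbers = []
    for piece in version.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break  # Stop at suffixes such as "rc1"
            digits += char
        numbers.append(int(digits) if digits else 0)
    # Pad short versions such as "2.7" to three components.
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def is_newer_than_supported(installed, supported=GITLAB_MLFLOW_API_VERSION):
    """Return True if the installed client is newer than the supported API version."""
    return parse_release(installed) > parse_release(supported)


def installed_mlflow_version():
    """Return the installed mlflow version string, or None if it is not installed."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None


installed = installed_mlflow_version()
if installed is None:
    print("mlflow is not installed in this environment")
elif is_newer_than_supported(installed):
    print(f"mlflow {installed} is newer than {GITLAB_MLFLOW_API_VERSION}; untested methods may behave differently")
else:
    print(f"mlflow {installed} is at or below {GITLAB_MLFLOW_API_VERSION}")
```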
