Container registry metadata database

Tier: Free, Premium, Ultimate
Offering: Self-managed
Status: Beta
caution
The metadata database is a beta feature. Carefully review the documentation before enabling the registry database in production! If you encounter a problem with either the import or operation of the registry, please add a comment in the feedback issue.

The metadata database enables many new registry features, including online garbage collection, and increases the efficiency of many registry operations. The work on the self-managed release of the registry metadata database feature is tracked in epic 5521.

By default, the container registry uses object storage to persist metadata related to container images. This method of storing metadata limits how efficiently the data can be accessed, especially data spanning multiple images, such as when listing tags. Storing this data in a database makes many new features possible, including online garbage collection, which removes old data automatically with zero downtime.

This database works in conjunction with the object storage already used by the registry, but does not replace object storage. You must continue to maintain an object storage solution even after migrating to a metadata database.

Known Limitations

  • No support for online migrations.
  • Geo Support is not confirmed.
  • Registry database migrations must be run manually when upgrading versions.
  • No guarantee for registry zero downtime during upgrades on multi-node Omnibus GitLab environments.

Metadata database feature support

You can migrate existing registries to the metadata database, and use online garbage collection.

Some database-enabled features are only enabled for GitLab.com, and automatic database provisioning for the registry database is not available. Review the feature support table in the feedback issue for the status of features related to the container registry database.

Enable the metadata database for Linux package installations

Prerequisites:

  • GitLab 16.7 or later.
  • PostgreSQL database version 12 or later. It must be accessible from the registry node.
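The two version prerequisites above can be checked mechanically. The following is a minimal sketch, assuming you already know both version strings; the `prerequisite_problems` helper is a name invented for this example and is not part of GitLab.

```ruby
require 'rubygems' # Gem::Version is normally loaded by default

# Minimum versions taken from the prerequisites above.
MIN_GITLAB_VERSION = Gem::Version.new('16.7')
MIN_POSTGRESQL_VERSION = Gem::Version.new('12')

# Hypothetical helper: returns a list of human-readable problems.
# An empty list means both version prerequisites are met.
def prerequisite_problems(gitlab_version, postgresql_version)
  problems = []
  if Gem::Version.new(gitlab_version) < MIN_GITLAB_VERSION
    problems << "GitLab #{gitlab_version} is older than #{MIN_GITLAB_VERSION}"
  end
  if Gem::Version.new(postgresql_version) < MIN_POSTGRESQL_VERSION
    problems << "PostgreSQL #{postgresql_version} is older than #{MIN_POSTGRESQL_VERSION}"
  end
  problems
end

puts prerequisite_problems('16.6.2', '14.9')
```

The sketch compares versions only. It does not verify that the database is reachable from the registry node.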

Follow the instructions that match your situation:

  • New installation or enabling the container registry for the first time.
  • Migrate existing container images to the metadata database.

Before you start

  • After you enable the database, you must continue to use it. The database is now the source of the registry metadata. Disabling it after this point causes the registry to lose visibility of all images written to it while the database was active.
  • Never run offline garbage collection at any point after the import step has been completed. That command is not compatible with registries using the metadata database, and it deletes data.
  • Verify you have not automated offline garbage collection.
  • You can first reduce the storage of your registry to speed up the process.
  • Back up your container registry data if possible.
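One way to check for automated offline garbage collection is to search scheduled jobs for the `gitlab-ctl registry-garbage-collect` command. The sketch below assumes the jobs live in crontab-style text. The sample crontab and the `offline_gc_jobs` helper are invented for illustration, and your automation might live elsewhere (for example, in systemd timers or CI schedules).

```ruby
# Subcommand used by Linux package installations for offline garbage collection.
OFFLINE_GC_COMMAND = 'registry-garbage-collect'

# Hypothetical helper: returns the non-comment lines of a crontab-style
# text that invoke offline garbage collection.
def offline_gc_jobs(crontab_text)
  crontab_text.each_line.map(&:strip).select do |line|
    !line.start_with?('#') && line.include?(OFFLINE_GC_COMMAND)
  end
end

# Invented sample; replace with the output of `crontab -l` for the relevant user.
sample = <<~CRON
  # m h dom mon dow command
  0 4 * * 0 /opt/gitlab/bin/gitlab-ctl registry-garbage-collect
  30 2 * * * /usr/local/bin/backup.sh
CRON

offline_gc_jobs(sample).each { |job| puts "Remove before importing: #{job}" }
```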

New installations

To enable the database:

  1. Edit /etc/gitlab/gitlab.rb by adding your database connection details, but start with the metadata database disabled:

    registry['database'] = {
      'enabled' => false,
      'host' => 'localhost',
      'port' => 5432,
      'user' => 'registry-database-user',
      'password' => 'registry-database-password',
      'dbname' => 'registry-database-name',
      'sslmode' => 'require', # See the PostgreSQL documentation for additional information https://www.postgresql.org/docs/current/libpq-ssl.html.
      'sslcert' => '/path/to/cert.pem',
      'sslkey' => '/path/to/private.key',
      'sslrootcert' => '/path/to/ca.pem'
    }
    
  2. Save the file and reconfigure GitLab.
  3. Apply schema migrations.
  4. Enable the database by editing /etc/gitlab/gitlab.rb and setting enabled to true:

    registry['database'] = {
      'enabled' => true,
      'host' => 'localhost',
      'port' => 5432,
      'user' => 'registry-database-user',
      'password' => 'registry-database-password',
      'dbname' => 'registry-database-name',
      'sslmode' => 'require', # See the PostgreSQL documentation for additional information https://www.postgresql.org/docs/current/libpq-ssl.html.
      'sslcert' => '/path/to/cert.pem',
      'sslkey' => '/path/to/private.key',
      'sslrootcert' => '/path/to/ca.pem'
    }
    
  5. Save the file and reconfigure GitLab.
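Because /etc/gitlab/gitlab.rb is evaluated as Ruby, the connection details could be defined once and only the `enabled` flag changed between steps 1 and 4. This is a sketch under that assumption rather than an official pattern. `REGISTRY_DB_CONNECTION` and `registry_database_settings` are names invented for this example, and the local `registry = {}` line is a stand-in so the snippet runs on its own.

```ruby
# Connection details shared by both reconfigure runs
# (placeholder values copied from the steps above).
REGISTRY_DB_CONNECTION = {
  'host' => 'localhost',
  'port' => 5432,
  'user' => 'registry-database-user',
  'password' => 'registry-database-password',
  'dbname' => 'registry-database-name',
  'sslmode' => 'require',
  'sslcert' => '/path/to/cert.pem',
  'sslkey' => '/path/to/private.key',
  'sslrootcert' => '/path/to/ca.pem'
}.freeze

# Hypothetical helper: returns the full database section with the flag applied.
def registry_database_settings(enabled:)
  REGISTRY_DB_CONNECTION.merge('enabled' => enabled)
end

# Stand-in for the `registry` object available inside gitlab.rb.
# Omit this line if you adapt the sketch to a real configuration file.
registry = {}

# Step 1: pass false. Step 4: pass true after the schema migrations are applied.
registry['database'] = registry_database_settings(enabled: false)
puts registry['database']['enabled']
```

Keeping a single copy of the connection details avoids the two hashes drifting apart between reconfigure runs.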

Existing registries

You can migrate your existing container registry data in one step or three steps. A few factors affect the duration of the migration:

  • The size of your existing registry data.
  • The specifications of your PostgreSQL instance.
  • The number of registry instances running.
  • Network latency between the registry, PostgreSQL, and your configured object storage.

Choose the one or three step method according to your registry installation.

One-step migration

caution
The registry must be shut down or remain in read-only mode during the migration. Only choose this method if you do not need to write to the registry during the migration and your registry contains a relatively small amount of data.
  1. Add the database section to your /etc/gitlab/gitlab.rb file, but start with the metadata database disabled:

    registry['database'] = {
      'enabled' => false, # Must be false!
      'host' => 'localhost',
      'port' => 5432,
      'user' => 'registry-database-user',
      'password' => 'registry-database-password',
      'dbname' => 'registry-database-name',
      'sslmode' => 'require', # See the PostgreSQL documentation for additional information https://www.postgresql.org/docs/current/libpq-ssl.html.
      'sslcert' => '/path/to/cert.pem',
      'sslkey' => '/path/to/private.key',
      'sslrootcert' => '/path/to/ca.pem'
    }
    
  2. Ensure the registry is set to read-only mode.

    Edit your /etc/gitlab/gitlab.rb file and add the maintenance section to the registry['storage'] configuration. For example, for a GCS-backed registry using a gs://my-company-container-registry bucket, the configuration could be:

    ## Object Storage - Container Registry
    registry['storage'] = {
      'gcs' => {
        'bucket' => "my-company-container-registry",
        'chunksize' => 5242880
      },
      'maintenance' => {
        'readonly' => {
          'enabled' => true # Must be set to true.
        }
      }
    }
    
  3. Save the file and reconfigure GitLab.
  4. Apply schema migrations if you have not done so.
  5. Run the following command:

    sudo gitlab-ctl registry-database import
    
  6. If the command completed successfully, the registry is now fully imported. You can now enable the database, turn off read-only mode in the configuration, and start the registry service:

    registry['database'] = {
      'enabled' => true, # Must be enabled now!
      'host' => 'localhost',
      'port' => 5432,
      'user' => 'registry-database-user',
      'password' => 'registry-database-password',
      'dbname' => 'registry-database-name',
      'sslmode' => 'require', # See the PostgreSQL documentation for additional information https://www.postgresql.org/docs/current/libpq-ssl.html.
      'sslcert' => '/path/to/cert.pem',
      'sslkey' => '/path/to/private.key',
      'sslrootcert' => '/path/to/ca.pem'
    }
    
    ## Object Storage - Container Registry
    registry['storage'] = {
      'gcs' => {
        'bucket' => "my-company-container-registry",
        'chunksize' => 5242880
      },
      'maintenance' => {
        'readonly' => {
          'enabled' => false
        }
      }
    }
    
  7. Save the file and reconfigure GitLab.

You can now use the metadata database for all operations!
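The read-only toggle used in steps 2 and 6 changes only the maintenance section of the storage configuration. As a minimal sketch, the hypothetical `with_read_only` helper below (not a GitLab function) builds both variants from one storage hash, assuming the GCS example values shown above.

```ruby
# Hypothetical helper: returns a copy of a registry['storage'] hash with
# read-only mode switched on or off. Driver settings and any other keys
# under 'maintenance' are preserved; the input hash is not modified.
def with_read_only(storage, enabled)
  maintenance = storage.fetch('maintenance', {})
  storage.merge(
    'maintenance' => maintenance.merge('readonly' => { 'enabled' => enabled })
  )
end

gcs_storage = {
  'gcs' => {
    'bucket' => 'my-company-container-registry',
    'chunksize' => 5242880
  }
}

during_import = with_read_only(gcs_storage, true)  # steps 2 to 5
after_import = with_read_only(gcs_storage, false)  # step 6 onward

puts during_import['maintenance']['readonly']['enabled']
puts after_import['maintenance']['readonly']['enabled']
```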

Three-step migration

Follow this guide to migrate your existing container registry data. This procedure is recommended for larger sets of data or if you are trying to minimize downtime while completing the migration.

note
Users have reported that the step one import completes at rates of 2 to 4 TB per hour. At the slower rate, registries with over 100 TB of data could take longer than 48 hours.
Pre-import repositories (step one)

For larger instances, this command can take hours to days to complete, depending on the size of your registry. You may continue to use the registry as normal while step one is being completed.
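For planning, the user-reported rates of 2 to 4 TB per hour give a rough range for step one. The sketch below is simple arithmetic under the assumption that those rates apply to your environment, which depends on the factors listed earlier. `estimate_step_one_hours` is a name invented for this example.

```ruby
# Import rates reported by users for step one, in TB per hour.
REPORTED_RATES_TB_PER_HOUR = { slow: 2.0, fast: 4.0 }.freeze

# Hypothetical helper: returns the estimated step-one duration in hours
# for each reported rate, given the registry size in TB.
def estimate_step_one_hours(registry_size_tb)
  REPORTED_RATES_TB_PER_HOUR.transform_values { |rate| registry_size_tb / rate }
end

estimate = estimate_step_one_hours(100)
puts "100 TB: #{estimate[:fast]} to #{estimate[:slow]} hours"
```

For a 100 TB registry this gives 25 to 50 hours, which matches the note that the slower rate can exceed 48 hours.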

caution
It is not yet possible to resume an interrupted migration, so it’s important to let the migration run to completion. If you must halt the operation, you have to restart this step from the beginning.
  1. Add the database section to your /etc/gitlab/gitlab.rb file, but start with the metadata database disabled:

    registry['database'] = {
      'enabled' => false, # Must be false!
      'host' => 'localhost',
      'port' => 5432,
      'user' => 'registry-database-user',
      'password' => 'registry-database-password',
      'dbname' => 'registry-database-name',
      'sslmode' => 'require', # See the PostgreSQL documentation for additional information https://www.postgresql.org/docs/current/libpq-ssl.html.
      'sslcert' => '/path/to/cert.pem',
      'sslkey' => '/path/to/private.key',
      'sslrootcert' => '/path/to/ca.pem'
    }
    
  2. Save the file and reconfigure GitLab.
  3. Apply schema migrations if you have not done so.
  4. Run the first step to begin the migration:

    sudo gitlab-ctl registry-database import --step-one
    
note
You should schedule the following step as soon as possible to reduce the amount of downtime required, ideally less than one week after step one completes. Any new data written to the registry between steps one and two causes step two to take more time.
Import all repository data (step two)

This step requires the registry to be shut down or set in read-only mode. Allow enough time for downtime while step two is being executed.

  1. Ensure the registry is set to read-only mode.

    Edit your /etc/gitlab/gitlab.rb file and add the maintenance section to the registry['storage'] configuration. For example, for a GCS-backed registry using a gs://my-company-container-registry bucket, the configuration could be:

    ## Object Storage - Container Registry
    registry['storage'] = {
      'gcs' => {
        'bucket' => "my-company-container-registry",
        'chunksize' => 5242880
      },
      'maintenance' => {
        'readonly' => {
          'enabled' => true # Must be set to true.
        }
      }
    }
    
  2. Save the file and reconfigure GitLab.
  3. Run step two of the migration:

    sudo gitlab-ctl registry-database import --step-two
    
  4. If the command completed successfully, all images are now fully imported. You can now enable the database, turn off read-only mode in the configuration, and start the registry service:

    registry['database'] = {
      'enabled' => true, # Must be set to true!
      'host' => 'localhost',
      'port' => 5432,
      'user' => 'registry-database-user',
      'password' => 'registry-database-password',
      'dbname' => 'registry-database-name',
      'sslmode' => 'require', # See the PostgreSQL documentation for additional information https://www.postgresql.org/docs/current/libpq-ssl.html.
      'sslcert' => '/path/to/cert.pem',
      'sslkey' => '/path/to/private.key',
      'sslrootcert' => '/path/to/ca.pem'
    }
    
    ## Object Storage - Container Registry
    registry['storage'] = {
      'gcs' => {
        'bucket' => "my-company-container-registry",
        'chunksize' => 5242880
      },
      'maintenance' => { # This section can be removed.
        'readonly' => {
          'enabled' => false
        }
      }
    }
    
  5. Save the file and reconfigure GitLab.

You can now use the metadata database for all operations!

Import the rest of the data (step three)

Even though the registry is now fully using the database for its metadata, it does not yet have access to any potentially unused layer blobs.

To complete the process, run the final step of the migration:

sudo gitlab-ctl registry-database import --step-three

After that command exits successfully, the registry is fully migrated to the database!
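The three steps differ in which settings must be active while each command runs. The following sketch summarizes them as data, following the procedure above. The phase names and the `describe_phase` helper are labels invented for this example. Step two can also run with the registry shut down; the sketch models the read-only option.

```ruby
# Settings each phase of the three-step migration expects while its
# command runs, as described in the steps above.
THREE_STEP_PHASES = {
  step_one:   { database_enabled: false, read_only: false,
                command: 'sudo gitlab-ctl registry-database import --step-one' },
  step_two:   { database_enabled: false, read_only: true,
                command: 'sudo gitlab-ctl registry-database import --step-two' },
  step_three: { database_enabled: true, read_only: false,
                command: 'sudo gitlab-ctl registry-database import --step-three' }
}.freeze

# Hypothetical helper: returns a one-line summary of what a phase requires.
# Raises KeyError for an unknown phase name.
def describe_phase(name)
  phase = THREE_STEP_PHASES.fetch(name)
  database = phase[:database_enabled] ? 'enabled' : 'disabled'
  mode = phase[:read_only] ? 'read-only' : 'read-write'
  "#{name}: database #{database}, registry #{mode}, run `#{phase[:command]}`"
end

THREE_STEP_PHASES.each_key { |name| puts describe_phase(name) }
```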

Manage schema migrations

Use the following commands to run the schema migrations for the container registry metadata database. The registry must be enabled, and the database section of its configuration must be filled in.

Apply schema migrations

  1. Run the registry database schema migrations:

    sudo gitlab-ctl registry-database migrate up
    
  2. The registry must stop if it’s running. Type y to confirm and wait for the process to finish.

note
The migrate up command offers some extra flags that can be used to control how the migrations are applied. Run sudo gitlab-ctl registry-database migrate up --help for details.

Undo schema migrations

You can undo schema migrations in case anything goes wrong, but this is a non-recoverable action. If you pushed new images while the database was in use, they will no longer be accessible after this.

  1. Undo the registry database schema migrations:

    sudo gitlab-ctl registry-database migrate down
    
note
The migrate down command offers some extra flags. Run sudo gitlab-ctl registry-database migrate down --help for details.

Online garbage collection monitoring

The initial runs of online garbage collection following the import process vary in duration based on the number of imported images. You should monitor the efficiency and health of your online garbage collection during this period.

Monitor database performance

After completing an import, expect the database to experience a period of high load as the garbage collection queues drain. This high load is caused by a high number of individual database calls from the online garbage collector processing the queued tasks.

Regularly check PostgreSQL and registry logs for any errors or warnings. In the registry logs, pay special attention to logs filtered by component=registry.gc.*.

Track metrics

Use monitoring tools like Prometheus and Grafana to visualize and track garbage collection metrics, focusing on metrics with a prefix of registry_gc_*. These include the number of objects marked for deletion, objects successfully deleted, run intervals, and durations.
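If you want a quick look at these metrics without a dashboard, you could filter the registry's Prometheus text output by the `registry_gc_` prefix. The sketch below assumes you have already fetched that text. The metric names in the sample are invented placeholders, not real registry metrics, and the simple parser does not handle label values that contain spaces.

```ruby
GC_METRIC_PREFIX = 'registry_gc_'

# Hypothetical helper: parses Prometheus text exposition lines and returns
# { "name{labels}" => value } for samples whose metric name starts with
# the garbage collection prefix. Comment and blank lines are skipped.
def gc_metric_samples(exposition_text)
  exposition_text.each_line.each_with_object({}) do |line, samples|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    series, value = line.split(' ', 2)
    next unless series.start_with?(GC_METRIC_PREFIX)

    samples[series] = value.to_f
  end
end

# Invented sample output; the metric names are placeholders.
sample = <<~METRICS
  # TYPE registry_gc_example_runs_total counter
  registry_gc_example_runs_total{worker="blob"} 42
  registry_http_example_requests_total 7
METRICS

gc_metric_samples(sample).each { |series, value| puts "#{series} = #{value}" }
```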

Queue monitoring

Check the size of the queues by counting the rows in the gc_blob_review_queue and gc_manifest_review_queue tables. Large queues are expected initially, with the number of rows proportional to the number of imported blobs and manifests. The queues should reduce over time, indicating that garbage collection is successfully reviewing jobs.

SELECT COUNT(*) FROM gc_blob_review_queue;
SELECT COUNT(*) FROM gc_manifest_review_queue;

Interpreting Queue Sizes:

  • Shrinking queues: Indicate garbage collection is successfully processing tasks.
  • Near-Zero gc_manifest_review_queue: Most images flagged for potential deletion have been reviewed and classified either as still in use or removed.
  • Overdue Tasks: Check for overdue GC tasks by running the following queries:

    SELECT COUNT(*) FROM gc_blob_review_queue WHERE review_after < NOW();
    SELECT COUNT(*) FROM gc_manifest_review_queue WHERE review_after < NOW();
    

    A high number of overdue tasks indicates a problem and should prompt an urgent inspection of the logs. Large queue sizes are not concerning as long as they are decreasing over time and the number of overdue tasks is close to zero.
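The interpretation rules above can be expressed as a small classifier over periodic query results. This is a sketch only: the `queue_status` helper, the status names, and the threshold for "close to zero" are assumptions made for illustration, not documented values.

```ruby
# Threshold for "close to zero" overdue tasks. This value is an assumption
# made for the sketch, not a documented limit.
OVERDUE_TOLERANCE = 10

# Hypothetical helper. `samples` is an array of { total:, overdue: } hashes
# ordered oldest to newest, for example the results of the COUNT(*) queries
# above collected at regular intervals for one queue.
def queue_status(samples)
  return :insufficient_data if samples.empty?

  latest = samples.last
  return :overdue_tasks if latest[:overdue] > OVERDUE_TOLERANCE
  return :insufficient_data if samples.length < 2

  shrinking = latest[:total] < samples.first[:total]
  (shrinking || latest[:total].zero?) ? :healthy : :not_shrinking
end

history = [
  { total: 120_000, overdue: 0 },
  { total: 95_000, overdue: 3 },
  { total: 61_000, overdue: 1 }
]
puts queue_status(history)
```

A large total does not change the result on its own, which matches the guidance that queue size matters less than its trend and the overdue count.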

Check GC logs for messages indicating that blobs are still in use, for example msg=the blob is not dangling, which implies they will not be deleted.
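To narrow registry logs down to garbage collection entries, you could filter on the component field and then on the message text. The sketch below assumes a key=value text log format. Your registry might emit JSON instead, in which case the matching would differ. The helper names and sample lines are invented for illustration.

```ruby
# Matches a component field whose value starts with registry.gc.
# (with or without surrounding double quotes).
GC_COMPONENT = /\bcomponent="?registry\.gc\./

# Hypothetical helper: keeps only garbage collection log lines.
def gc_log_lines(lines)
  lines.select { |line| line.match?(GC_COMPONENT) }
end

# Hypothetical helper: GC lines reporting blobs that are still referenced
# and therefore will not be deleted.
def blobs_still_in_use(lines)
  gc_log_lines(lines).select { |line| line.include?('the blob is not dangling') }
end

# Invented sample lines; real component names and fields may differ.
sample = [
  'level=info msg="the blob is not dangling" component=registry.gc.example',
  'level=info msg="request served" component=registry.http.example',
  'level=info msg="processing task" component=registry.gc.example'
]

puts gc_log_lines(sample).length
puts blobs_still_in_use(sample).length
```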

Adjust blobs interval

If the size of your gc_blob_review_queue is high, and you want to increase the frequency of the garbage collection blob or manifest worker runs, reduce the interval configuration from the default (5s) to 1s:

registry['gc'] = {
  'blobs' => {
    'interval' => '1s'
  },
  'manifests' => {
    'interval' => '1s'
  }
}

After the migration load has been cleared, you should fine-tune these settings for the long term to avoid unnecessary CPU load on the database and registry instances. You can gradually increase the interval to a value that balances performance and resource usage.

Validate data consistency

To ensure data consistency after the import, use the crane validate tool. This tool checks that all image layers and manifests in your container registry are accessible and correctly linked. By running crane validate, you confirm that the images in your registry are complete and accessible, ensuring a successful import.

Review cleanup policies

If most of your images are tagged, garbage collection won’t significantly reduce storage space because it only deletes untagged images.

Implement cleanup policies to remove unneeded tags, which eventually causes images to be removed through garbage collection and storage space to be recovered.

Troubleshooting

there are pending database migrations error

If the registry has been updated and there are pending schema migrations, the registry fails to start with the following error message:

FATA[0000] configuring application: there are pending database migrations, use the 'registry database migrate' CLI command to check and apply them

To fix this issue, follow the steps to apply schema migrations.

offline garbage collection is no longer possible error

If the registry uses the metadata database and you try to run offline garbage collection, the registry fails with the following error message:

ERRO[0000] this filesystem is managed by the metadata database, and offline garbage collection is no longer possible, if you are not using the database anymore, remove the file at the lock_path in this log message lock_path=/docker/registry/lockfiles/database-in-use

You must either:

  • Stop using offline garbage collection.
  • If you no longer use the metadata database, delete the indicated lock file at the lock_path shown in the error message. For example, remove the /docker/registry/lockfiles/database-in-use file.
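When scripting around this error, the lock file location can be read from the `lock_path` field of the message rather than hard-coded. The following minimal sketch only extracts the value and does not delete anything. `lock_path_from` is a name invented for this example.

```ruby
# Hypothetical helper: extracts the lock_path value from the registry's
# offline garbage collection error line. Returns nil when the line has
# no lock_path=<value> field.
def lock_path_from(log_line)
  log_line[/\block_path=(\S+)/, 1]
end

line = 'ERRO[0000] this filesystem is managed by the metadata database, ' \
       'and offline garbage collection is no longer possible, if you are ' \
       'not using the database anymore, remove the file at the lock_path ' \
       'in this log message lock_path=/docker/registry/lockfiles/database-in-use'

puts lock_path_from(line)
```

Remove the lock file only if you have actually stopped using the metadata database, because the lock is what prevents offline garbage collection from deleting data.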