Supported Geo data types

Tier: Premium, Ultimate
Offering: GitLab Self-Managed

A Geo data type is a specific class of data that is required by one or more GitLab features to store relevant information.

To replicate data produced by these features with Geo, we use several strategies to access, transfer, and verify them.

Data types

We distinguish between the following data types: Git repositories, container repositories, blobs, and databases.

The following table lists each feature or component we replicate, its corresponding data type, and its replication and verification methods:

Type	Feature / component	Replication method	Verification method
Database	Application data in PostgreSQL	Native	Native
Database	Redis	Not applicable ¹	Not applicable
Database	Elasticsearch	Native	Native
Database	SSH public keys	PostgreSQL Replication	PostgreSQL Replication
Git	Project repository	Geo with Gitaly	Gitaly Checksum
Git	Project wiki repository	Geo with Gitaly	Gitaly Checksum
Git	Project designs repository	Geo with Gitaly	Gitaly Checksum
Git	Project Snippets	Geo with Gitaly	Gitaly Checksum
Git	Personal Snippets	Geo with Gitaly	Gitaly Checksum
Git	Group wiki repository	Geo with Gitaly	Gitaly Checksum
Blob	User uploads (file system)	Geo with API	SHA256 checksum
Blob	User uploads (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	LFS objects (file system)	Geo with API	SHA256 checksum
Blob	LFS objects (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	CI job artifacts (file system)	Geo with API	SHA256 checksum
Blob	CI job artifacts (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Archived CI build traces (file system)	Geo with API	Not implemented
Blob	Archived CI build traces (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Container registry (file system)	Geo with API/Docker API	SHA256 checksum
Blob	Container registry (object storage)	Geo with API/Managed/Docker API ²	SHA256 checksum ³
Blob	Package registry (file system)	Geo with API	SHA256 checksum
Blob	Package registry (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Terraform Module Registry (file system)	Geo with API	SHA256 checksum
Blob	Terraform Module Registry (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Versioned Terraform State (file system)	Geo with API	SHA256 checksum
Blob	Versioned Terraform State (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	External merge request diffs (file system)	Geo with API	SHA256 checksum
Blob	External merge request diffs (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Pipeline artifacts (file system)	Geo with API	SHA256 checksum
Blob	Pipeline artifacts (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Pages (file system)	Geo with API	SHA256 checksum
Blob	Pages (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	CI Secure Files (file system)	Geo with API	SHA256 checksum
Blob	CI Secure Files (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Incident Metric Images (file system)	Geo with API/Managed	SHA256 checksum
Blob	Incident Metric Images (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Alert Metric Images (file system)	Geo with API	SHA256 checksum
Blob	Alert Metric Images (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Blob	Dependency Proxy Images (file system)	Geo with API	SHA256 checksum
Blob	Dependency Proxy Images (object storage)	Geo with API/Managed ²	SHA256 checksum ³
Container Repository	Container registry (file system)	Geo with API/Docker API	SHA256 checksum
Container Repository	Container registry (object storage)	Geo with API/Managed/Docker API ²	SHA256 checksum ³

Footnotes:

1. Redis replication can be used as part of high availability (HA) with Redis Sentinel. It’s not used between Geo sites.
2. Object storage replication can be performed by Geo or by the native replication feature of your object storage provider or appliance.
3. Object storage verification is behind a feature flag, `geo_object_storage_verification`, introduced in GitLab 16.4 and enabled by default. It uses a checksum of the file size to verify the files.

Git repositories

A GitLab instance can have one or more repository shards. Each shard has a Gitaly instance that is responsible for allowing access and operations on the locally stored Git repositories. It can run on a machine:

- With a single disk.
- With multiple disks mounted as a single mount-point (like with a RAID array).
- Using LVM.

GitLab does not require a special file system and can work with a mounted Storage Appliance. However, there can be performance limitations and consistency issues when using a remote file system.

Geo triggers garbage collection in Gitaly to deduplicate forked repositories on Geo secondary sites.

Communication goes through the Gitaly gRPC API, with three possible ways of synchronizing:

- Using regular Git clone/fetch from one Geo site to another (with special authentication).
- Using repository snapshots (for when the first method fails or the repository is corrupt).
- Using a manual trigger from the Admin area (a combination of both of the above).
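
The fallback from regular fetch to snapshot can be illustrated with a minimal sketch. This is not Geo's or Gitaly's actual code: `sync_repository`, `SyncResult`, and the `fetch:` and `snapshot:` callables are hypothetical names introduced only to show the order in which the methods are tried.

```ruby
# Hypothetical sketch: none of these names are GitLab or Gitaly APIs.
SyncResult = Struct.new(:method_used, :ok)

# Tries a regular fetch first and falls back to a snapshot when the
# fetch raises, mirroring the order of the first two methods above.
def sync_repository(fetch:, snapshot:)
  fetch.call
  SyncResult.new(:fetch, true)
rescue StandardError
  begin
    snapshot.call
    SyncResult.new(:snapshot, true)
  rescue StandardError
    # Both methods failed; the caller decides whether to retry later.
    SyncResult.new(:snapshot, false)
  end
end

# Stand-in callables; a real sync would talk to Gitaly instead.
healthy = sync_repository(fetch: -> { :ok }, snapshot: -> { :ok })
broken  = sync_repository(fetch: -> { raise "corrupt repository" }, snapshot: -> { :ok })

puts healthy.method_used
puts broken.method_used
```

In this sketch, a failed fetch is signalled by an exception, which is an assumption made for brevity.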

Each project can have at most three different repositories:

- A project repository, where the source code is stored.
- A wiki repository, where the wiki content is stored.
- A design repository, where design artifacts are indexed (assets are actually in LFS).

They all live in the same shard and share the same base name, with a `-wiki` or `-design` suffix for the wiki and design repositories.

Besides that, there are snippet repositories. They can be connected to a project or to some specific user. Both types are synced to a secondary site.

Container repositories

Container repositories are stored in the container registry. They are a GitLab-specific concept built on top of a container registry as the datastore.

Blobs

GitLab stores files and blobs, such as issue attachments or LFS objects, in either:

- The file system in a specific location.
- An Object Storage solution. Object Storage solutions can be:
  - Cloud based, like Amazon S3 and Google Cloud Storage.
  - Hosted by you (like MinIO).
  - A Storage Appliance that exposes an Object Storage-compatible API.

If you use the file system store instead of Object Storage and run GitLab on more than one node, use network-mounted file systems.

With respect to replication and verification:

- We transfer files and blobs using an internal API request.
- With Object Storage, you can either:
  - Use your cloud provider's replication functionality.
  - Have GitLab replicate it for you.
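
The SHA256 checksum verification listed for blobs in the table above can be illustrated with a minimal sketch. It is not Geo's implementation: `sha256_of` and `blob_verified?` are hypothetical helpers, and two local temporary files stand in for the copies held on the primary and secondary sites.

```ruby
require "digest"
require "tempfile"

# Streams the file through SHA256 so large blobs are not read into memory at once.
def sha256_of(path)
  Digest::SHA256.file(path).hexdigest
end

# Hypothetical helper: true when both copies produce the same checksum.
def blob_verified?(primary_path, secondary_path)
  sha256_of(primary_path) == sha256_of(secondary_path)
end

# Two temporary files stand in for the primary and secondary copies.
primary = Tempfile.new("primary")
secondary = Tempfile.new("secondary")
primary.write("artifact contents")
primary.flush
secondary.write("artifact contents")
secondary.flush

puts blob_verified?(primary.path, secondary.path)
```

A mismatch between the two checksums would indicate that the secondary copy differs from the primary and needs to be synced again.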

Databases

GitLab relies on data stored in multiple databases for different use cases. PostgreSQL is the single point of truth for user-generated content in the Web interface, such as issue content and comments, as well as permissions and credentials.

PostgreSQL can also hold some cached data, such as HTML-rendered Markdown and cached merge request diffs. The cached diffs can also be configured to be offloaded to object storage.

We use PostgreSQL’s own replication functionality to replicate data from the primary to secondary sites.

We use Redis both as a cache store and to hold persistent data for our background jobs system. Because the data in both use cases is specific to a single Geo site, we don’t replicate it between sites.

Elasticsearch is an optional database for advanced search. It can improve search both at the source-code level and in user-generated content in issues, merge requests, and discussions. Elasticsearch is not supported in Geo.

Replicated data types

Replicated data types behind a feature flag

The replication for some data types is behind a corresponding feature flag.

Enable or disable replication (for some data types)

Replication for some data types is released behind feature flags that are enabled by default. GitLab administrators with access to the GitLab Rails console can opt to disable it for their instance. You can find the feature flag name for each of those data types in the Notes column of the table below.

To disable replication, for example for package files:

Feature.disable(:geo_package_file_replication)

To enable replication, for example for package files:

Feature.enable(:geo_package_file_replication)

Features not on this list, or with No in the Replicated column, are not replicated to a secondary site. Failing over without manually replicating data from those features causes the data to be lost. To use those features on a secondary site, or to execute a failover successfully, you must replicate their data using some other means.

Feature	Replicated (added in GitLab version)	Verified (added in GitLab version)	GitLab-managed object storage replication (added in GitLab version)	GitLab-managed object storage verification (added in GitLab version)	Notes
Application data in PostgreSQL	Yes (10.2)	Yes (10.2)	Not applicable	Not applicable
Project repository	Yes (10.2)	Yes (10.7)	Not applicable	Not applicable	Migrated to self-service framework in 16.2. See GitLab issue #367925 for more details. Behind feature flag `geo_project_repository_replication`, enabled by default in 16.3. All projects, including archived projects, are replicated.
Project wiki repository	Yes (10.2)²	Yes (10.7)²	Not applicable	Not applicable	Migrated to self-service framework in 15.11. See GitLab issue #367925 for more details. Behind feature flag `geo_project_wiki_repository_replication`, enabled by default in 15.11.
Group wiki repository	Yes (13.10)	Yes (16.3)	Not applicable	Not applicable	Behind feature flag `geo_group_wiki_repository_replication`, enabled by default.
Uploads	Yes (10.2)	Yes (14.6)	Yes (15.1)	Yes (16.4)³	Replication is behind the feature flag `geo_upload_replication`, enabled by default. Verification was behind the feature flag `geo_upload_verification`, removed in 14.8.
LFS objects	Yes (10.2)	Yes (14.6)	Yes (15.1)	Yes (16.4)³	GitLab versions 11.11.x and 12.0.x are affected by a bug that prevents any new LFS objects from replicating. Replication is behind the feature flag `geo_lfs_object_replication`, enabled by default. Verification was behind the feature flag `geo_lfs_object_verification`, removed in 14.7.
Personal snippets	Yes (10.2)	Yes (10.2)	Not applicable	Not applicable
Project snippets	Yes (10.2)	Yes (10.2)	Not applicable	Not applicable
CI job artifacts	Yes (10.4)	Yes (14.10)	Yes (15.1)	Yes (16.4)³	Verification is behind the feature flag `geo_job_artifact_replication`, enabled by default in 14.10.
CI Pipeline Artifacts	Yes (13.11)	Yes (13.11)	Yes (15.1)	Yes (16.4)³	Persists additional artifacts after a pipeline completes.
CI Secure Files	Yes (15.3)	Yes (15.3)	Yes (15.3)	Yes (16.4)³	Verification is behind the feature flag `geo_ci_secure_file_replication`, enabled by default in 15.3.
Container registry	Yes (12.3)¹	Yes (15.10)	Yes (12.3)¹	Yes (15.10)	See instructions to set up the container registry replication.
Terraform Module Registry	Yes (14.0)	Yes (14.0)	Yes (15.1)	Yes (16.4)³	Behind feature flag `geo_package_file_replication`, enabled by default.
Project designs repository	Yes (12.7)	Yes (16.1)	Yes (16.4)³	Yes (16.4)³	Designs also require replication of LFS objects and Uploads.
Package registry	Yes (13.2)	Yes (13.10)	Yes (15.1)	Yes (16.4)³	Behind feature flag `geo_package_file_replication`, enabled by default.
Versioned Terraform State	Yes (13.5)	Yes (13.12)	Yes (15.1)	Yes (16.4)³	Replication is behind the feature flag `geo_terraform_state_version_replication`, enabled by default. Verification was behind the feature flag `geo_terraform_state_version_verification`, which was removed in 14.0.
External merge request diffs	Yes (13.5)	Yes (14.6)	Yes (15.1)	Yes (16.4)³	Replication is behind the feature flag `geo_merge_request_diff_replication`, enabled by default. Verification was behind the feature flag `geo_merge_request_diff_verification`, removed in 14.7.
Versioned snippets	Yes (13.7)	Yes (14.2)	Yes (16.4)³	Yes (16.4)³	Verification was implemented behind the feature flag `geo_snippet_repository_verification` in 13.11, and the feature flag was removed in 14.2.
GitLab Pages	Yes (14.3)	Yes (14.6)	Yes (15.1)	Yes (16.4)³	Behind feature flag `geo_pages_deployment_replication`, enabled by default. Verification was behind the feature flag `geo_pages_deployment_verification`, removed in 14.7.
Project-level Secure files	Yes (15.3)	Yes (15.3)	Yes (15.3)	Yes (16.4)³
Incident Metric Images	Yes (15.5)	Yes (15.5)	Yes (15.5)	Yes (16.4)³	Replication/Verification is handled via the Uploads data type.
Alert Metric Images	Yes (15.5)	Yes (15.5)	Yes (15.5)	Yes (16.4)³	Replication/Verification is handled via the Uploads data type.
Server-side Git hooks	Not planned	No	Not applicable	Not applicable	Not planned because of current implementation complexity, low customer interest, and availability of alternatives to hooks.
Elasticsearch integration	Not planned	No	No	No	Not planned because further product discovery is required and Elasticsearch (ES) clusters can be rebuilt. Secondaries use the same ES cluster as the primary.
Dependency Proxy Images	Yes (15.7)	Yes (15.7)	Yes (15.7)	Yes (16.4)³
Vulnerability Export	Not planned	No	No	No	Not planned because they are ephemeral and sensitive information. They can be regenerated on demand.
Packages NPM metadata cache	Not planned	No	No	No	Not planned because it would not notably improve disaster recovery capabilities nor response times at secondary sites.

Footnotes:

1. Migrated to self-service framework in 15.5. See GitLab issue #337436 for more details.
2. Migrated to self-service framework in 15.11. Behind feature flag `geo_project_wiki_repository_replication`, enabled by default. See GitLab issue #367925 for more details.
3. Verification of files stored in object storage was introduced in GitLab 16.4 with a feature flag named `geo_object_storage_verification`, enabled by default.
