Configure storage for the GitLab chart

Tier: Free, Premium, Ultimate
Offering: GitLab Self-Managed

The following applications within the GitLab chart require persistent storage to maintain state.

  • Gitaly (persists the Git repositories)
  • PostgreSQL (persists the GitLab database data)
  • Redis (persists GitLab job data)
  • MinIO (persists the object storage data)

The administrator may choose to provision this storage using dynamic or static volume provisioning.

Important: Plan storage before installation to avoid extra migration work later. Changes made after the first deployment require manual edits to existing Kubernetes objects before running helm upgrade.

Typical Installation Behavior

The installer creates storage using the default storage class and dynamic volume provisioning. Applications connect to this storage through a Persistent Volume Claim. Administrators are encouraged to use dynamic volume provisioning instead of static volume provisioning when it is available.

Administrators should determine the default storage class in their production environment using kubectl get storageclass and then examine it using kubectl describe storageclass *STORAGE_CLASS_NAME*. Some providers, such as Amazon EKS, do not provide a default storage class.

Configuring Cluster Storage

Recommendations

The default storage class should:

  • Use fast SSD storage when available
  • Set reclaimPolicy to Retain

Uninstalling GitLab without the reclaimPolicy set to Retain allows automated jobs to completely delete the volume, disk, and data. Some platforms set the default reclaimPolicy to Delete. The Gitaly persistent volume claims do not follow this rule because they belong to a StatefulSet.

Minimal Storage Class Configurations

The following YAML configurations provide the bare minimum required to create a custom storage class for GitLab. Replace CUSTOM_STORAGE_CLASS_NAME with a value appropriate for the target installation environment.
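
As a sketch of such a configuration, the manifest below defines a storage class for Amazon EKS. It assumes the legacy in-tree kubernetes.io/aws-ebs provisioner, the gp2 volume type, and a placeholder AWS_ZONE. Clusters that rely on a CSI driver (for example ebs.csi.aws.com on EKS or pd.csi.storage.gke.io on GKE) need a different provisioner value and may express zone restrictions differently, so treat this as a starting point rather than a definitive configuration.

```yaml
# Sketch only: provisioner, type, and zone are assumptions for an EKS
# cluster that still accepts the in-tree EBS provisioner.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: CUSTOM_STORAGE_CLASS_NAME
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain   # keep the volume and its data if GitLab is uninstalled
parameters:
  type: gp2             # SSD-backed EBS volume type
  zone: AWS_ZONE        # placeholder: pin new volumes to one availability zone
```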

Some users report that on Amazon EKS, nodes are not always created in the same zone as the pods. Setting the zone parameter in the storage class mitigates this risk.

Using the Custom Storage Class

Set the custom storage class as the cluster default so that it is used for all dynamic provisioning.

kubectl patch storageclass CUSTOM_STORAGE_CLASS_NAME -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Alternatively, the custom storage class and other options may be provided per service to Helm during installation. View the provided example configuration file and modify for your environment.

helm upgrade --install gitlab gitlab/gitlab -f HELM_OPTIONS_YAML_FILE
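
For illustration, HELM_OPTIONS_YAML_FILE might contain something like the following. This is a minimal sketch, not the provided example file: the sizes are illustrative assumptions, and only the Gitaly and MinIO keys are shown.

```yaml
# Hypothetical contents of HELM_OPTIONS_YAML_FILE; sizes are assumptions.
gitlab:
  gitaly:
    persistence:
      storageClass: CUSTOM_STORAGE_CLASS_NAME
      size: 50Gi
minio:
  persistence:
    storageClass: CUSTOM_STORAGE_CLASS_NAME
    size: 10Gi
# PostgreSQL and Redis are bundled subcharts. Their persistence keys can
# differ by chart version, so take the exact paths from their documentation.
```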

For further reading and additional persistence options, see the persistence documentation for each service (Gitaly, PostgreSQL, Redis, and MinIO).

Note: Some of the advanced persistence options differ between PostgreSQL and the others, so it’s important to check the specific documentation for each before making changes.

Using Static Volume Provisioning

Dynamic volume provisioning is recommended, but some clusters or environments may not support it. In that case, administrators must create the Persistent Volume manually.

Using Google GKE

  1. Create a persistent disk in the cluster.
gcloud compute disks create --size=50GB --zone=*GKE_ZONE* *DISK_VOLUME_NAME*
  2. Create the Persistent Volume after modifying the example YAML configuration.
kubectl create -f *PV_YAML_FILE*
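
A PV_YAML_FILE for the disk created in step 1 could look roughly like this. The sketch assumes the legacy in-tree gcePersistentDisk volume source, an ext4 file system, and a hypothetical PV_NAME. Clusters that rely on the Compute Engine persistent disk CSI driver may need a csi volume source instead.

```yaml
# Sketch only: PV_NAME is hypothetical; capacity should match the disk size.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: PV_NAME
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep the disk when the claim is deleted
  storageClassName: CUSTOM_STORAGE_CLASS_NAME
  gcePersistentDisk:                      # legacy in-tree volume source (assumption)
    pdName: DISK_VOLUME_NAME              # disk name passed to gcloud in step 1
    fsType: ext4
```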

Using Amazon EKS

Note: If you need to deploy in multiple zones, review Amazon’s documentation on storage classes when defining your storage solution.

  1. Create a persistent disk in the cluster.
aws ec2 create-volume --availability-zone=*AWS_ZONE* --size=10 --volume-type=gp2
  2. Create the Persistent Volume after modifying the example YAML configuration.
kubectl create -f *PV_YAML_FILE*
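
On EKS, the equivalent PV_YAML_FILE references the EBS volume by the VolumeId that aws ec2 create-volume returns. As a hedged sketch, the manifest below assumes the legacy in-tree awsElasticBlockStore volume source, while AWS_VOLUME_ID and PV_NAME are hypothetical placeholders. Clusters that use the EBS CSI driver may need a csi volume source instead.

```yaml
# Sketch only: AWS_VOLUME_ID and PV_NAME are hypothetical placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: PV_NAME
spec:
  capacity:
    storage: 10Gi                         # size of the volume created in step 1
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: CUSTOM_STORAGE_CLASS_NAME
  awsElasticBlockStore:                   # legacy in-tree volume source (assumption)
    volumeID: AWS_VOLUME_ID               # VolumeId from the create-volume output
    fsType: ext4
```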

Manually creating PersistentVolumeClaims

The Gitaly service deploys using a StatefulSet. Create the PersistentVolumeClaim using the following naming convention for it to be properly recognized and used.

<mount-name>-<statefulset-pod-name>

The mount-name for Gitaly is repo-data. The StatefulSet pod names are created using:

<statefulset-name>-<pod-index>

The GitLab chart determines the statefulset-name using:

<chart-release-name>-<service-name>

For a release named gitlab, the correct name for the PersistentVolumeClaim of the first Gitaly pod is therefore repo-data-gitlab-gitaly-0.
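
Put together, a claim that follows this naming convention might look like the sketch below. It assumes a release named gitlab, a manually created volume with the hypothetical name PV_NAME, and a 50Gi request. Create it in the same namespace as the release.

```yaml
# Sketch only: PV_NAME and the requested size are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # <mount-name>-<chart-release-name>-<service-name>-<pod-index>
  name: repo-data-gitlab-gitaly-0
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: CUSTOM_STORAGE_CLASS_NAME
  volumeName: PV_NAME        # bind to the manually created Persistent Volume
  resources:
    requests:
      storage: 50Gi
```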

Note: If using Praefect with multiple Virtual Storages, you will need one PersistentVolumeClaim per Gitaly replica per Virtual Storage defined. For example, if you have default and vs2 Virtual Storages defined, each with 2 replicas, then you need the following PersistentVolumeClaims:

  • repo-data-gitlab-gitaly-default-0
  • repo-data-gitlab-gitaly-default-1
  • repo-data-gitlab-gitaly-vs2-0
  • repo-data-gitlab-gitaly-vs2-1

Modify the example YAML configuration for your environment and reference it when invoking helm.

Services that do not use a StatefulSet accept a volumeName in their configuration. The chart still creates the volume claim and attempts to bind it to the manually created volume. Check the chart documentation for each included application.

In most cases, modify the example YAML configuration, keeping only the services that will use the manually created disk volumes.
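
As an illustration, a values file that points a single service at a manually created volume might look like this. The sketch assumes the MinIO subchart exposes persistence.volumeName and uses a hypothetical PV_NAME; confirm the exact key in each application's chart documentation before relying on it.

```yaml
# Sketch only: the key path and PV_NAME are assumptions to verify.
minio:
  persistence:
    volumeName: PV_NAME      # name of the manually created Persistent Volume
```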

Making changes to storage after installation

After the initial installation, storage changes such as migrating to new volumes or changing disk sizes require editing the Kubernetes objects outside of the Helm upgrade command.

See the managing persistent volumes documentation.

Optional volumes

For larger installations, you may need to add persistent storage to the Toolbox to get backups/restores working. See our troubleshooting documentation for a guide on how to do this.