# Troubleshooting GitLab chart development environment

All steps noted here are for DEVELOPMENT ENVIRONMENTS ONLY. Administrators may find the information insightful, but the outlined fixes are destructive and would have a major negative impact on production systems.

## Passwords and secrets failing or unsynchronized

Developers commonly deploy, delete, and re-deploy a release into the same cluster multiple times. Kubernetes secrets and persistent volume claims created by StatefulSets are intentionally not removed by `helm delete RELEASE_NAME`.
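To see what survives a release deletion, query for resources that still carry the release label. A minimal sketch, assuming the chart labels its resources with `release=RELEASE_NAME` (the same label used by the cleanup command further below):

```shell
# Delete the release; secrets and StatefulSet PVCs are left behind by design.
helm delete RELEASE_NAME

# List the leftovers, selected by the release label.
kubectl get secrets,pvc -l release=RELEASE_NAME
```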

Removing only the Kubernetes secrets leads to interesting problems. For example, a new deployment's migration pod fails because GitLab Rails cannot connect to the database: the freshly generated password secret no longer matches the password stored in the old persistent volume.
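One way to confirm the mismatch is to read the failing migrations pod's logs, which typically surface a PostgreSQL authentication error. The pod name below is a placeholder; look up the real one first:

```shell
# Find the migrations pod for the release (names vary by chart version).
kubectl get pods -l release=RELEASE_NAME

# A stale password usually shows up in the logs as something like:
#   FATAL: password authentication failed for user "gitlab"
kubectl logs MIGRATIONS_POD_NAME
```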

To completely wipe a release from a development environment, including secrets, a developer must remove both the secrets and the persistent volume claims:

```shell
# DO NOT run these commands in a production environment. Disaster will strike.
kubectl delete secrets,pvc -lrelease=RELEASE_NAME
```
NOTE:
This deletes all Kubernetes secrets, including TLS certificates, and all data in the database. This should not be performed on a production instance.

## Database is broken and needs reset

The database environment can be reset in a development environment by completing the following steps (a command-line sketch follows the note below):

1. Delete the PostgreSQL StatefulSet.
1. Delete the PostgreSQL PersistentVolumeClaim.
1. Deploy GitLab again with `helm upgrade --install`.
NOTE:
This will delete all data in the databases and should not be run in production.
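As a sketch of those steps, assuming the chart's bundled PostgreSQL and its usual resource names (verify them first, since names vary by chart version and release name):

```shell
# Confirm the actual names before deleting anything.
kubectl get statefulsets,pvc -l release=RELEASE_NAME

# 1. Delete the PostgreSQL StatefulSet (hypothetical name shown).
kubectl delete statefulset RELEASE_NAME-postgresql

# 2. Delete its PersistentVolumeClaim, which holds the database files.
kubectl delete pvc data-RELEASE_NAME-postgresql-0

# 3. Redeploy; PostgreSQL is recreated with a fresh, empty volume.
helm upgrade --install RELEASE_NAME gitlab/gitlab
```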

## CI clusters are low on available resources

You may notice one or more CI clusters running low on available resources, like CPU and memory. Our clusters are configured to automatically scale the number of available nodes, but sometimes we hit the upper limit, after which no more nodes can be created. In this case, a good first step is to check whether any installations of the GitLab Helm Charts in the clusters can be removed.
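To gauge how constrained a cluster actually is, check per-node usage and look for unschedulable pods. Note that `kubectl top` requires the metrics server to be installed in the cluster:

```shell
# Per-node CPU and memory usage (requires metrics-server).
kubectl top nodes

# Pods stuck in Pending are a sign the autoscaler has hit its node limit.
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
```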

Installations are usually cleaned up automatically by the Review Apps logic in the pipeline, but this can fail for various reasons.

As a workaround, these installations can be manually deleted by running the associated `stop_review` job(s) in CI. To make this easier, use the `helm_ci_triage.sh` script to get a list of running installations and open the associated pipeline to run the `stop_review` job(s). Further usage details are available in the script.
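If the script is unavailable, a manual fallback is to list the Helm releases still installed in the cluster; the namespace below is a placeholder for wherever the review installations live:

```shell
# List releases across all namespaces, sorted by release date,
# to spot installations the automatic cleanup missed.
helm list --all-namespaces --date

# Or narrow the listing to a specific namespace (placeholder name).
helm list --namespace REVIEW_APPS_NAMESPACE --date
```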