Moving repositories managed by GitLab
- Tier: Free, Premium, Ultimate
- Offering: GitLab Self-Managed
Move all repositories managed by GitLab to another file system or another server.
Move data in a GitLab instance
Use the GitLab API to move Git repositories:
- Between servers.
- Between different storages.
- From single-node Gitaly to Gitaly Cluster (Praefect).
GitLab repositories can be associated with projects, groups, and snippets. Each of these types has a separate API for moving the repositories. To move all repositories on a GitLab instance, each type of repository must be moved for each storage.
Each repository is made read-only for the duration of the move and is not writable until the move is finished.
To move repositories:
- Ensure all local and cluster storages are accessible to the GitLab instance. In this example, these are <original_storage_name> and <cluster_storage_name>.
- Configure repository storage weights so that the new storages receive all new projects. This stops new projects from being created on existing storages while the migration is in progress.
- Schedule repository moves for projects, snippets, and groups.
- If you use Geo, resync all repositories.
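The scheduling step can be scripted across all three repository types. The following is a minimal sketch, not an official tool. GITLAB_URL and GITLAB_TOKEN are assumed environment variables, move_payload and schedule_all_moves are hypothetical helper names, and storage names are assumed to contain no characters that need JSON escaping. The three endpoints are the ones used in the sections that follow.

```shell
#!/bin/sh
# Sketch: schedule moves for every repository type from one storage to another.
# GITLAB_URL and GITLAB_TOKEN are assumed to be set by the caller.

# Build the JSON request body for a bulk move.
# Assumes the storage names need no JSON escaping.
move_payload() {
  printf '{"source_storage_name":"%s","destination_storage_name":"%s"}' "$1" "$2"
}

# Schedule moves for projects, snippets, and groups in turn.
# Stops at the first request that fails.
schedule_all_moves() {
  src="$1"
  dst="$2"
  for kind in project snippet group; do
    curl --fail --silent --show-error --request POST \
      --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
      --header "Content-Type: application/json" \
      --data "$(move_payload "$src" "$dst")" \
      "${GITLAB_URL}/api/v4/${kind}_repository_storage_moves" || return 1
  done
}

# Example call (placeholders as used elsewhere on this page):
# schedule_all_moves "<original_storage_name>" "<cluster_storage_name>"
```

Group moves require the Premium or Ultimate tier, so on the Free tier you would remove group from the loop.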
Move projects
You can move all projects or individual projects.
To move all projects by using the API:
Schedule repository storage moves for all projects on a storage shard using the API. For example:
curl --request POST --header "Private-Token: <your_access_token>" \
     --header "Content-Type: application/json" \
     --data '{"source_storage_name":"<original_storage_name>","destination_storage_name":"<cluster_storage_name>"}' \
     "https://gitlab.example.com/api/v4/project_repository_storage_moves"
Query the most recent repository moves using the API. The response indicates one of the following:
- The moves have completed successfully. The state field is finished.
- The moves are in progress. Re-query the repository moves until they complete successfully.
- The moves have failed. Most failures are temporary and are solved by rescheduling the move.
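The three outcomes can be told apart by reading the state fields in the response. The following is a rough sketch that avoids a JSON parser by matching text. It assumes that state appears only as the field of each move and that any state containing the word failed counts as a failure. extract_states and moves_status are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: summarize the "state" fields of a storage-moves API response on stdin.

# Print one state value per line.
extract_states() {
  grep -Eo '"state"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Print one word describing the batch:
#   none        - the response contains no moves
#   failed      - at least one move reports a failed state
#   in_progress - no failures, but not every move is finished
#   finished    - every move is finished
moves_status() {
  states="$(extract_states)"
  if [ -z "$states" ]; then
    echo "none"
  elif printf '%s\n' "$states" | grep -q 'failed'; then
    echo "failed"
  elif printf '%s\n' "$states" | grep -qvx 'finished'; then
    echo "in_progress"
  else
    echo "finished"
  fi
}

# Example call:
# curl --silent --header "PRIVATE-TOKEN: <your_access_token>" \
#   "https://gitlab.example.com/api/v4/project_repository_storage_moves" | moves_status
```

The same function can read the snippet and group endpoints. Because API list responses are paginated, this summarizes only the moves on the page that was fetched.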
After the moves are complete, use the API to query projects and confirm that all projects have moved. None of the projects should be returned with the repository_storage field set to the old storage. For example:
curl --header "Private-Token: <your_access_token>" --header "Content-Type: application/json" \
     "https://gitlab.example.com/api/v4/projects?repository_storage=<original_storage_name>"
Alternatively, use the Rails console to confirm that all projects have moved:
ProjectRepository.for_repository_storage('<original_storage_name>')
Repeat for each storage as required.
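The API check for leftover projects can be wrapped in a small function so that it is easy to repeat per storage. This is a sketch under assumptions: GITLAB_URL and GITLAB_TOKEN are assumed environment variables, storage_has_no_projects is a hypothetical name, the storage name is assumed to need no URL encoding, and an empty JSON array is taken to mean that no projects remain.

```shell
#!/bin/sh
# Sketch: succeed only when the API returns no projects for the given storage.
# Return codes: 0 = no projects left, 1 = projects remain, 2 = request failed.
storage_has_no_projects() {
  body="$(curl --fail --silent --show-error \
    --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
    "${GITLAB_URL}/api/v4/projects?repository_storage=$1")" || return 2
  # Strip whitespace so that "[ ]" and "[]" compare equal.
  [ "$(printf '%s' "$body" | tr -d '[:space:]')" = "[]" ]
}

# Example call:
# storage_has_no_projects "<original_storage_name>" && echo "storage is empty"
```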
If you don’t want to move all projects, follow the instructions for moving individual projects.
Move snippets
You can move all snippets or individual snippets.
To move all snippets by using the API:
Schedule repository storage moves for all snippets on a storage shard. For example:
curl --request POST --header "PRIVATE-TOKEN: <your_access_token>" \
     --header "Content-Type: application/json" \
     --data '{"source_storage_name":"<original_storage_name>","destination_storage_name":"<cluster_storage_name>"}' \
     "https://gitlab.example.com/api/v4/snippet_repository_storage_moves"
Query the most recent repository moves. The response indicates one of the following:
- The moves have completed successfully. The state field is finished.
- The moves are in progress. Re-query the repository moves until they complete successfully.
- The moves have failed. Most failures are temporary and are solved by rescheduling the move.
After the moves are complete, use the Rails console to confirm that all snippets have moved:
SnippetRepository.for_repository_storage('<original_storage_name>')
The command should not return snippets for the original storage.
Repeat for each storage as required.
If you don’t want to move all snippets, follow the instructions for individual snippets.
Move groups
- Tier: Premium, Ultimate
- Offering: GitLab Self-Managed
You can move all groups or individual groups.
To move all groups by using the API:
Schedule repository storage moves for all groups on a storage shard. For example:
curl --request POST --header "PRIVATE-TOKEN: <your_access_token>" \
     --header "Content-Type: application/json" \
     --data '{"source_storage_name":"<original_storage_name>","destination_storage_name":"<cluster_storage_name>"}' \
     "https://gitlab.example.com/api/v4/group_repository_storage_moves"
Query the most recent repository moves. The response indicates one of the following:
- The moves have completed successfully. The state field is finished.
- The moves are in progress. Re-query the repository moves until they complete successfully.
- The moves have failed. Most failures are temporary and are solved by rescheduling the move.
After the moves are complete, use the Rails console to confirm that all groups have moved:
GroupWikiRepository.for_repository_storage('<original_storage_name>')
The command should not return groups for the original storage.
Repeat for each storage as required.
If you don’t want to move all groups, follow the instructions for individual groups.
Migrate to another GitLab instance
You can’t move data by using the API if you are migrating to a new GitLab environment. For example:
- From a single-node GitLab to a scaled-out architecture.
- From a GitLab instance in your private data center to a cloud provider.
In these cases, you can copy all your repositories from /var/opt/gitlab/git-data/repositories to /mnt/gitlab/repositories. The approach to use depends on the scenario:
- The target directory is empty.
- The target directory contains an outdated copy of the repositories.
- You have thousands of repositories.
Each of the approaches can or does overwrite data in the target directory /mnt/gitlab/repositories. You must correctly specify the source and the target.
Use backup and restore (recommended)
For either Gitaly or Gitaly Cluster (Praefect) targets, you should use the GitLab backup and restore capability. Git repositories are accessed, managed, and stored on GitLab servers by Gitaly as a database. You can experience data loss if you directly access and copy Gitaly files using tools like rsync. With backup and restore, you can:
- Improve backup performance by processing multiple repositories concurrently.
- Create backups of just the repositories by using the skip feature.
You must use the backup and restore method for Gitaly Cluster (Praefect) targets.
Use tar
You can use a tar pipe to move repositories if:
- You specify Gitaly targets and not Gitaly Cluster targets.
- The target directory /mnt/gitlab/repositories is empty.
This method has low overhead and tar is usually pre-installed on your system. However, you cannot resume an interrupted tar pipe. If tar is interrupted, you must empty the target directory and copy all the data again. To see progress of the tar process, replace -xf with -xvf.
sudo -u git sh -c 'tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
tar -C /mnt/gitlab/repositories -xf -'
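Because the tar pipe is only appropriate for an empty target, it can help to check that condition before copying. The following is a sketch, not part of GitLab. dir_is_empty and copy_with_tar are hypothetical helper names, and the tar invocation is the same pipe as in the command above. Run it as the git user, as that command does.

```shell
#!/bin/sh
# Sketch: run the tar pipe only when the target directory exists and is empty.

# Succeed when the directory exists and has no entries, hidden files included.
dir_is_empty() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Copy the contents of src into dst with a tar pipe.
# Refuses to run when dst is missing or already contains data.
copy_with_tar() {
  src="$1"
  dst="$2"
  if ! dir_is_empty "$dst"; then
    echo "refusing to copy: $dst is missing or not empty" >&2
    return 1
  fi
  # The exit status of this pipeline is that of the extracting tar only.
  tar -C "$src" -cf - -- . | tar -C "$dst" -xf -
}

# Example call:
# copy_with_tar /var/opt/gitlab/git-data/repositories /mnt/gitlab/repositories
```

If an interrupted run leaves partial data behind, the guard makes the next attempt fail until the target directory has been emptied.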
Use a tar pipe to another server
For Gitaly targets, you can use a tar pipe to copy data to another server. If your git user has SSH access to the new server as git@<newserver>, you can pipe the data through SSH.
If you want to compress the data before it goes over the network (which increases CPU usage), you can replace ssh with ssh -C.
sudo -u git sh -c 'tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
ssh git@newserver tar -C /mnt/gitlab/repositories -xf -'
Use rsync
You can use rsync to move repositories if:
- You specify Gitaly targets and not Gitaly Cluster targets.
- The target directory already contains a partial or outdated copy of the repositories, which means copying all the data again with tar is inefficient.
You must use the --delete option when using rsync. Using rsync without --delete can cause data loss and repository corruption. For more information, see issue 270422.
The trailing /. on the source path in the following command is required. Without it, you can get the wrong directory structure in the target directory. If you want to see progress, replace -a with -av.
sudo -u git sh -c 'rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
/mnt/gitlab/repositories'
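Because --delete removes anything in the target that is not in the source, you might want to preview the transfer first. The following sketch adds the rsync options --dry-run and --itemize-changes to the command above, so that it lists what would be copied or deleted without changing anything. preview_rsync is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list what "rsync -a --delete" would change, without changing it.
# $1 is the source directory, $2 is the target directory.
preview_rsync() {
  # "$1/." copies the contents of the source, not the directory itself.
  rsync -a --delete --dry-run --itemize-changes "$1/." "$2"
}

# Example call (run as the git user, as in the command above):
# preview_rsync /var/opt/gitlab/git-data/repositories /mnt/gitlab/repositories
```

Entries that would be removed from the target are reported on lines that contain the word deleting. An unexpectedly long list of deletions suggests that the source and target are swapped or mistyped.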
Use rsync to another server
For Gitaly targets, you can send the repositories over the network with rsync if the git user on your source system has SSH access to the target server.
sudo -u git sh -c 'rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
git@newserver:/mnt/gitlab/repositories'