Bitbucket Server importer developer documentation

Prerequisites

To test imports, you need a Bitbucket Server instance running locally. For information on running a local instance, see these instructions.

Code structure

The importer’s codebase is broken up into the following directories:

lib/gitlab/bitbucket_server_import: this directory contains most of the code such as the classes used for importing resources.
app/workers/gitlab/bitbucket_server_import: this directory contains the Sidekiq workers.

How imports advance

When a Bitbucket Server project is imported, work is divided into separate stages, with each stage consisting of a set of Sidekiq jobs that are executed.

Between every stage, a job called Gitlab::BitbucketServerImport::AdvanceStageWorker is scheduled that periodically checks if all work of the current stage is completed. If all the work is complete, the job advances the import process to the next stage.

Stages

1. Stage::ImportRepositoryWorker

This worker imports the repository and schedules the next stage when done.

2. Stage::ImportPullRequestsWorker

This worker imports all pull requests. For every pull request, a job for the Gitlab::BitbucketImport::ImportPullRequestWorker worker is scheduled.

Bitbucket Server keeps tracks of references for open pull requests in refs/heads/pull-requests, but closed and merged requests get moved into hidden internal refs under stash-refs/pull-requests.

As a result, they are not fetched by default. To prevent merge requests from not having commits and therefore having empty diffs, we fetch affected source and target commits from the server before importing the pull request. We save the fetched commits as refs so that Git doesn’t remove them, which can happen if they are unused. Source commits are saved as #{commit}:refs/merge-requests/#{pull_request.iid}/head and target commits are saved as #{commit}:refs/keep-around/#{commit}.

When creating a pull request, we need to match Bitbucket users with GitLab users for the author and reviewers. Whenever a matching user is found, the GitLab user ID is cached for 24 hours so that it doesn’t have to be searched for again.

3. Stage::ImportNotesWorker

This worker imports notes (comments) for all merge requests.

For every merge request, a job for the Gitlab::BitbucketServerImport::ImportPullRequestNotesWorker worker is scheduled which imports all standalone comments, inline comments, merge events, and approved events for the merge request.

4. Stage::ImportLfsObjectsWorker

Imports LFS objects from the source project by scheduling a Gitlab::BitbucketServerImport::ImportLfsObjectsWorker job for every LFS object.

5. Stage::FinishImportWorker

This worker completes the import process by performing some housekeeping such as marking the import as completed.

Pull request mentions

Pull request descriptions and notes can contain @mentions to users. If a user with the same email does not exist on GitLab, this can lead to incorrect users being tagged.

To get around this, we build a cache containing all users who have access to the Bitbucket project and then convert mentions in pull request descriptions and notes.

Backoff and retry

In order to handle rate limiting, requests are wrapped with BitbucketServer::RetryWithDelay. This wrapper checks if the response is rate limited and retries once after the delay specified in the response headers.