- The problem description
- How to select the proper level of acceleration?
- Upload encodings
- Uploading technologies
GitLab Workhorse has special rules for handling uploads. To avoid tying up a Ruby process on I/O operations, we process the upload in Workhorse, where it is cheaper. This process can also upload directly to object storage.
The following graph explains machine boundaries in a scalable GitLab installation. Without any workhorse optimization in place, we can expect incoming requests to follow the numbers on the arrows.
We have three challenges here: performance, availability, and scalability.
Rails processes are expensive in terms of both CPU and memory. Ruby's global interpreter lock adds to the cost too, because the Ruby process spends time on I/O operations in step 3, causing incoming requests to pile up.
In order to improve this, disk buffered upload was implemented. With this, Rails no longer deals with writing uploaded files to disk.
There is also an availability problem in this setup: NFS is a single point of failure.
To address this problem, an HA object storage can be used, and it is supported by direct upload.
Scaling NFS is outside of our support scope, and NFS is not a part of cloud native installations.
All features that require Sidekiq and do not use direct upload won't work without NFS. In Kubernetes, machine boundaries translate to pods, and in this case the uploaded file is written to the pod's private disk. Because the Sidekiq pod cannot reach into other pods, it fails to read the file and the operation fails.
Selecting the proper acceleration is a tradeoff between speed of development and operational costs.
We can identify three major use-cases for an upload:
- storage: if we are uploading in order to store a file (for example, artifacts, packages, or discussion attachments). In this case, direct upload is the proper level because it is the least resource-intensive operation. Additional information can be found in File Storage in GitLab.
- in-controller/synchronous processing: if we allow processing small files synchronously, using disk buffered upload may speed up development.
- Sidekiq/asynchronous processing: asynchronous processing must implement direct upload, because it is the only way to support Cloud Native deployments without a shared NFS.
For more details about currently broken features, see epic &1802.
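The mapping from use-case to acceleration level described above can be summarized as a small lookup. This is an illustrative Python sketch only: the enum, the function name, and the use-case labels are hypothetical and are not part of GitLab.

```python
from enum import Enum


class Acceleration(Enum):
    DISK_BUFFERED = "disk buffered upload"
    DIRECT_UPLOAD = "direct upload"


def select_acceleration(use_case: str) -> Acceleration:
    """Return the acceleration level suggested for a given use-case.

    The labels "storage", "sync", and "async" are hypothetical shorthand
    for the three use-cases listed in the text.
    """
    if use_case in ("storage", "async"):
        # Storing a file, or processing it in Sidekiq, calls for direct upload.
        return Acceleration.DIRECT_UPLOAD
    if use_case == "sync":
        # Small files processed in the controller may use disk buffered upload.
        return Acceleration.DISK_BUFFERED
    raise ValueError(f"unknown use-case: {use_case!r}")
```

The point of the sketch is that only the synchronous, small-file case leaves room for disk buffered upload; the other two cases both lead to direct upload.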
Some features involve Git repository uploads without using a regular Git client. Some examples are uploading a repository file from the web interface and design management.
Those uploads require the Rails controller to act as a Git client in lieu of the user. Those operations fall into the in-controller/synchronous processing category, but we have no guarantees on the file size.
In the case of an LFS upload, the file pointer is committed synchronously, but the file upload to object storage is performed asynchronously with Sidekiq.
By upload encoding we mean how the file is included within the incoming request.
We have three kinds of file encoding in our uploads:
- multipart/form-data: the most common. A file is encoded as a part of a multipart encoded request.
- body: some APIs upload files as the whole request body.
- JSON: some JSON APIs upload files as base64-encoded strings. This will require a change to GitLab Workhorse, which is planned.
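To make the three encodings concrete, the following Python sketch builds the headers and payload that would carry the same file content in each of them. The function names, field names, and boundary string are hypothetical, and the multipart body is built by hand only to show its shape.

```python
import base64
import json


def as_multipart(field, filename, content, boundary="example-boundary"):
    """Encode the file as one part of a multipart/form-data request."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, head + content + tail


def as_body(content):
    """Send the file as the whole request body."""
    return {"Content-Type": "application/octet-stream"}, content


def as_json(field, content):
    """Embed the file as a base64-encoded string inside a JSON document."""
    payload = json.dumps({field: base64.b64encode(content).decode("ascii")})
    return {"Content-Type": "application/json"}, payload.encode()
```

In the first two cases the raw bytes appear in the request unchanged, while in the JSON case a receiver has to parse the document and decode the string before it can see the file.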
By uploading technologies we mean how all the involved services interact with each other.
GitLab supports three kinds of uploading technologies. A brief description with a sequence diagram for each one follows. The diagrams are not meant to be exhaustive.
This is the default kind of upload, and it is the most expensive in terms of resources.
In this case, Workhorse is unaware of files being uploaded and acts as a regular proxy.
When a multipart request reaches the Rails application, Rack::Multipart leaves behind tempfiles in /tmp and uses valuable Ruby process time to copy files around.
This kind of upload avoids the resources wasted by handling upload writes to /tmp in Rails.
This optimization is not active by default on REST API requests.
When enabled, Workhorse looks for files in multipart MIME requests, uploading any it finds to a temporary file on shared storage. The MIME data in the request is replaced with the path to the corresponding file before it is forwarded to Rails.
To prevent abuse of this feature, Workhorse signs the modified request with a special header, stating which entries it modified. Rails will ignore any unsigned path entries.
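The rewrite-and-sign idea can be sketched as follows. This is a minimal Python illustration, not Workhorse's actual implementation: the shared secret, the `.path` suffix, the function names, and the use of a plain HMAC over the list of rewritten fields are all assumptions made for the example.

```python
import hashlib
import hmac
import json
import os
import tempfile

SECRET = b"shared-secret"  # hypothetical secret known to both proxy and application


def sign(rewritten):
    """Sign the list of field names that the proxy rewrote."""
    message = json.dumps(sorted(rewritten)).encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def proxy_rewrite(fields, upload_dir):
    """Proxy side: write file parts to disk and replace them with their paths."""
    out, rewritten = {}, []
    for name, value in fields.items():
        if isinstance(value, bytes):  # bytes values stand in for file parts
            fd, path = tempfile.mkstemp(dir=upload_dir)
            with os.fdopen(fd, "wb") as handle:
                handle.write(value)
            out[name + ".path"] = path
            rewritten.append(name)
        else:
            out[name] = value
    return out, rewritten, sign(rewritten)


def app_accept(fields, rewritten, signature):
    """Application side: drop any path entry the signature does not vouch for."""
    trusted = hmac.compare_digest(signature, sign(rewritten))
    accepted = {}
    for name, value in fields.items():
        if name.endswith(".path"):
            if trusted and name[: -len(".path")] in rewritten:
                accepted[name] = value
        else:
            accepted[name] = value
    return accepted
```

A client that adds its own `.path` entry to the request cannot get it accepted, because that field name is not in the signed list and the client cannot produce a valid signature without the secret.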
Sequence diagram (the opening steps were lost; only the tail is recoverable): if async processing is required, Rails schedules a job in Redis and Redis confirms that the job is scheduled. Rails then returns the request result to the client, and Workhorse cleans up. Later, if async processing is required, Sidekiq fetches the job from Redis, reads the file from storage, and processes it.
This is the most advanced acceleration technique we have in place.
Workhorse asks Rails for temporary pre-signed object storage URLs and uploads directly to object storage.
In this setup, an extra Rails route must be implemented in order to handle authorization. Examples of this can be found in:
Note: this falls back to disk buffered upload when direct_upload is disabled in the object storage settings. In that case, the answer to the /authorize call contains only a file system path.
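The fallback can be pictured with a short Python sketch. Everything named here is hypothetical: the keys `store_url` and `temp_path`, the example URL and path, and both function names are chosen for illustration and do not describe the real /authorize payload.

```python
def authorize_answer(direct_upload_enabled: bool) -> dict:
    """Hypothetical application-side answer to the /authorize call."""
    if direct_upload_enabled:
        # A temporary pre-signed URL that the proxy can upload to directly.
        return {"store_url": "https://object-storage.example/tmp/upload-123?signature=abc"}
    # With direct upload disabled, only a file system path is returned.
    return {"temp_path": "/shared/uploads/tmp"}


def choose_destination(answer: dict) -> tuple:
    """Proxy side: prefer object storage, otherwise buffer the upload on disk."""
    store_url = answer.get("store_url")
    if store_url:
        return ("object_storage", store_url)
    return ("disk", answer["temp_path"])
```

With this shape, the proxy needs no separate configuration flag: the content of the answer alone tells it which of the two upload paths to take.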
Sequence diagram (the opening steps were lost; only the tail is recoverable): Rails asks object storage to move the object to its final destination and receives the result. If async processing is required, Rails schedules a job in Redis and Redis confirms that the job is scheduled. Rails then returns the request result to the client, and Workhorse cleans up. Later, if async processing is required, Sidekiq fetches the job from Redis, gets the object from object storage, and processes the file.