Uploads guide: Adding new uploads

Here, we describe how to add a new upload route accelerated by Workhorse.

Upload routes belong to one of these categories:

  1. Rails controllers: uploads handled by Rails controllers.
  2. Grape API: uploads handled by a Grape API endpoint.
  3. GraphQL API: uploads handled by a GraphQL resolve function.

Caution: GraphQL uploads do not support direct upload. Depending on the use case, the feature may not work on installations without NFS (like GitLab.com or Kubernetes installations). Uploading to object storage inside the GraphQL resolve function may result in timeout errors. For more details, follow issue #280819.

Update Workhorse for the new route

For both Rails controller and Grape API uploads, Workhorse must be updated to support the new upload route.

  1. Open a new issue in the Workhorse tracker describing precisely the new upload route:
    • The route’s URL.
    • The upload encoding.
    • If possible, provide a dump of the upload request.
  2. Implement the change and get the merge request for that issue merged.
  3. Ask the Workhorse maintainers to create a new release. You can ask in the merge request itself during the maintainer review, or in the #workhorse Slack channel.
  4. Bump the Workhorse version file to the newly released version, either on its own or in the same merge request that contains the Rails changes. Refer to Implementing the new route with a Rails controller or Implementing the new route with a Grape API endpoint below.

Implementing the new route with a Rails controller

A Rails controller upload is usually a multipart/form-data upload. Keep the following in mind:

  1. The upload is available under the parameter name you’re using. For example, it could be an artifact or a nested parameter such as user[avatar]. If you have the upload under the file parameter, reading params[:file] should get you an UploadedFile instance.
  2. In general, check that the parameter is an instance of the UploadedFile class before using it. For example, see how we checked that the parameter is indeed an UploadedFile.

Caution: Do not build an UploadedFile instance by calling UploadedFile#from_params directly! This method can be unsafe to use depending on the params passed. Instead, use the UploadedFile instance that multipart.rb builds automatically for you.
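
The type check described above can be sketched as follows. This is a minimal illustration and not code from the GitLab codebase: the UploadedFile class is a stand-in so the example runs on its own, and the uploaded_file_from and create helpers, the status codes, and the messages are hypothetical names and values chosen for the example.

```ruby
# Stand-in for GitLab's UploadedFile class, defined here only so the sketch
# runs on its own. In GitLab, multipart.rb builds the real instance.
class UploadedFile
  attr_reader :original_filename

  def initialize(original_filename)
    @original_filename = original_filename
  end
end

# Hypothetical helper: returns the upload only when it is a genuine
# UploadedFile, and nil for anything else (for example, a plain string
# that a client sent under the same parameter name).
def uploaded_file_from(params, key = :file)
  candidate = params[key]
  candidate.is_a?(UploadedFile) ? candidate : nil
end

# Hypothetical controller-style action: reject the request early when the
# parameter is not an UploadedFile, otherwise continue processing.
def create(params)
  file = uploaded_file_from(params)
  return { status: 400, message: 'file is invalid' } unless file

  { status: 201, message: "stored #{file.original_filename}" }
end
```

Note that the sketch only inspects the object it is given and never constructs an UploadedFile from request parameters, which matches the caution above.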

Implementing the new route with a Grape API endpoint

For a Grape API upload, we can have a body or multipart upload. This case is slightly more complicated because two endpoints are needed: one for the Workhorse pre-upload authorization, and one for accepting the upload metadata from Workhorse:

  1. Implement an endpoint with the URL + /authorize suffix that will:
    • Check that the request is coming from Workhorse by calling require_gitlab_workhorse! from the API helpers.
    • Check user permissions.
    • Set the status to 200 with status 200.
    • Set the content type with content_type Gitlab::Workhorse::INTERNAL_API_CONTENT_TYPE.
    • Use your dedicated Uploader class (let’s say that it’s FileUploader) to build the response with FileUploader.workhorse_authorize(params).
  2. Implement the endpoint for the upload request that will:
    • Require all the UploadedFile objects as parameters.
      • For example, if we expect a single parameter file to be an UploadedFile instance, use requires :file, type: ::API::Validations::Types::WorkhorseFile.
      • Body upload requests have their upload available under the parameter file.
    • Check that the request is coming from Workhorse by calling require_gitlab_workhorse! from the API helpers.
    • Check the user permissions.
    • Run the remaining processing code. In this step, the code must read the parameter. For our example, it would be params[:file].

Caution: Do not build an UploadedFile object by calling UploadedFile#from_params directly! This method can be unsafe to use depending on the params passed. Instead, use the UploadedFile object that multipart.rb builds automatically for you.
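
The order of checks in the two endpoints can be sketched in plain Ruby. This is a simulation under stated assumptions, not Grape code and not GitLab's implementation: Request, ForbiddenError, authorize_upload!, authorize_endpoint, and upload_endpoint are hypothetical names, the classes and helpers are stand-ins that only share a name with the real ones, and the content type value and the authorize response payload are placeholders.

```ruby
# Stand-ins so the sketch runs without GitLab or Grape loaded. The names mirror
# the ones used in the steps above, but every body here is hypothetical.
INTERNAL_API_CONTENT_TYPE = 'application/vnd.gitlab-workhorse+json' # assumed value

class UploadedFile
  attr_reader :path

  def initialize(path)
    @path = path
  end
end

class ForbiddenError < StandardError; end

Request = Struct.new(:from_workhorse, :user_can_upload, :params, keyword_init: true)

# Mirrors require_gitlab_workhorse!: refuse requests that bypassed Workhorse.
def require_gitlab_workhorse!(request)
  raise ForbiddenError, 'request must come from Workhorse' unless request.from_workhorse
end

# Hypothetical permission check.
def authorize_upload!(request)
  raise ForbiddenError, 'user may not upload' unless request.user_can_upload
end

# Stand-in for the dedicated uploader. The payload is a placeholder; the real
# response is built by the uploader class.
module FileUploader
  def self.workhorse_authorize(_params)
    { 'example_key' => 'example_value' }
  end
end

# Step 1: the "<URL>/authorize" endpoint.
def authorize_endpoint(request)
  require_gitlab_workhorse!(request)
  authorize_upload!(request)
  {
    status: 200,
    content_type: INTERNAL_API_CONTENT_TYPE,
    body: FileUploader.workhorse_authorize(request.params)
  }
end

# Step 2: the upload endpoint. The parameter must already be an UploadedFile.
def upload_endpoint(request)
  file = request.params[:file]
  raise ArgumentError, 'file must be an UploadedFile' unless file.is_a?(UploadedFile)

  require_gitlab_workhorse!(request)
  authorize_upload!(request)
  { status: 201, stored_from: file.path }
end
```

In a real Grape endpoint, the parameter type check in step 2 is expressed declaratively with requires :file, type: ::API::Validations::Types::WorkhorseFile, as described in the list above, and not with an explicit is_a? check.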

Document Object Storage buckets and CarrierWave integration

When using Object Storage, GitLab expects each kind of upload to maintain its own bucket in the respective Object Storage destination. Moreover, the integration with CarrierWave is not used all the time. The Object Storage Working Group is investigating an approach that unifies Object Storage buckets into a single one and removes CarrierWave so as to simplify implementation and administration of uploads.

Therefore, document new uploads here by slotting them into the following tables:

Feature bucket details

| Feature | Upload technology | Uploader | Bucket structure |
| --- | --- | --- | --- |
| Job artifacts | direct upload | workhorse | /artifacts/<proj_id_hash>/<date>/<job_id>/<artifact_id> |
| Pipeline artifacts | carrierwave | sidekiq | /artifacts/<proj_id_hash>/pipelines/<pipeline_id>/artifacts/<artifact_id> |
| Live job traces | fog | sidekiq | /artifacts/tmp/builds/<job_id>/chunks/<chunk_index>.log |
| Job traces archive | carrierwave | sidekiq | /artifacts/<proj_id_hash>/<date>/<job_id>/<artifact_id>/job.log |
| Autoscale runner caching | N/A | gitlab-runner | /gitlab-com-[platform-]runners-cache/??? |
| Backups | N/A | s3cmd, awscli, or gcs | /gitlab-backups/??? |
| Git LFS | direct upload | workhorse | /lfs-objects/<lfs_obj_oid[0:2]>/<lfs_obj_oid[2:2]> |
| Design management files | disk buffering | rails controller | /lfs-objects/<lfs_obj_oid[0:2]>/<lfs_obj_oid[2:2]> |
| Design management thumbnails | carrierwave | sidekiq | /uploads/design_management/action/image_v432x230/<model_id> |
| Generic file uploads | direct upload | workhorse | /uploads/@hashed/[0:2]/[2:4]/<hash1>/<hash2>/file |
| Generic file uploads - personal snippets | direct upload | workhorse | /uploads/personal_snippet/<snippet_id>/<filename> |
| Global appearance settings | disk buffering | rails controller | /uploads/appearance/... |
| Topics | disk buffering | rails controller | /uploads/projects/topic/... |
| Avatar images | direct upload | workhorse | /uploads/[user,group,project]/avatar/<model_id> |
| Import/export | direct upload | workhorse | /uploads/import_export_upload/??? |
| GitLab Migration | carrierwave | sidekiq | /uploads/bulk_imports/??? |
| MR diffs | carrierwave | sidekiq | /external-diffs/merge_request_diffs/mr-<mr_id>/diff-<diff_id> |
| Package manager archives | direct upload | sidekiq | /packages/<proj_id_hash>/packages/<pkg_segment>/files/<pkg_file_id> |
| Package manager archives | direct upload | sidekiq | /packages/<container_id_hash>/debian_*_component_file/<component_file_id> |
| Package manager archives | direct upload | sidekiq | /packages/<container_id_hash>/debian_*_distribution/<distribution_file_id> |
| Container image cache (?) | direct upload | workhorse | /dependency-proxy/<group_id_hash>/dependency_proxy/<group_id>/files/<proxy_id>/<blob_id or manifest_id> |
| Terraform state files | carrierwave | rails controller | /terraform/<proj_id_hash>/<terraform_state_id> |
| Pages content archives | carrierwave | sidekiq | /gitlab-gprd-pages/<proj_id_hash>/pages_deployments/<deployment_id>/ |
| Secure Files | carrierwave | sidekiq | /ci-secure-files/<proj_id_hash>/secure_files/<secure_file_id>/ |
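
To make the slice notation in the "Generic file uploads" row concrete, here is a small sketch that assembles a path of that shape. It rests on assumptions the table does not state: that <hash1> is the SHA-256 hex digest of the project ID (as in GitLab's hashed storage), that [0:2] and [2:4] mean the first two and the next two characters of that digest, and that <hash2> is an opaque per-upload value. The helper names hashed_prefix and generic_upload_path are hypothetical.

```ruby
require 'digest'

# Assumption: the hash is the SHA-256 hex digest of the project ID. The two
# short segments are the first two and the next two characters of the digest.
def hashed_prefix(project_id)
  hash = Digest::SHA256.hexdigest(project_id.to_s)
  "@hashed/#{hash[0, 2]}/#{hash[2, 2]}/#{hash}"
end

# Hypothetical helper that fills in the "Generic file uploads" row.
# `secret` stands in for <hash2>; the table does not say how it is produced.
def generic_upload_path(project_id, secret, filename)
  "/uploads/#{hashed_prefix(project_id)}/#{secret}/#{filename}"
end

puts generic_upload_path(1, 'abc123', 'report.pdf')
```

The Git LFS row uses a different notation (<lfs_obj_oid[0:2]> and <lfs_obj_oid[2:2]>), which reads as a start index and a length, like Ruby's String#[start, length]. Both notations describe the same split: two characters, then the next two.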

CarrierWave integration

| File | CarrierWave usage | Categorized |
| --- | --- | --- |
| app/models/project.rb | include Avatarable | Yes |
| app/models/projects/topic.rb | include Avatarable | Yes |
| app/models/group.rb | include Avatarable | Yes |
| app/models/user.rb | include Avatarable | Yes |
| app/models/terraform/state_version.rb | include FileStoreMounter | Yes |
| app/models/ci/job_artifact.rb | include FileStoreMounter | Yes |
| app/models/ci/pipeline_artifact.rb | include FileStoreMounter | Yes |
| app/models/pages_deployment.rb | include FileStoreMounter | Yes |
| app/models/lfs_object.rb | include FileStoreMounter | Yes |
| app/models/dependency_proxy/blob.rb | include FileStoreMounter | Yes |
| app/models/dependency_proxy/manifest.rb | include FileStoreMounter | Yes |
| app/models/packages/composer/cache_file.rb | include FileStoreMounter | Yes |
| app/models/packages/package_file.rb | include FileStoreMounter | Yes |
| app/models/concerns/packages/debian/component_file.rb | include FileStoreMounter | Yes |
| ee/app/models/issuable_metric_image.rb | include FileStoreMounter | |
| ee/app/models/vulnerabilities/remediation.rb | include FileStoreMounter | |
| ee/app/models/vulnerabilities/export.rb | include FileStoreMounter | |
| app/models/packages/debian/project_distribution.rb | include Packages::Debian::Distribution | Yes |
| app/models/packages/debian/group_distribution.rb | include Packages::Debian::Distribution | Yes |
| app/models/packages/debian/project_component_file.rb | include Packages::Debian::ComponentFile | Yes |
| app/models/packages/debian/group_component_file.rb | include Packages::Debian::ComponentFile | Yes |
| app/models/merge_request_diff.rb | mount_uploader :external_diff, ExternalDiffUploader | Yes |
| app/models/note.rb | mount_uploader :attachment, AttachmentUploader | Yes |
| app/models/appearance.rb | mount_uploader :logo, AttachmentUploader | Yes |
| app/models/appearance.rb | mount_uploader :header_logo, AttachmentUploader | Yes |
| app/models/appearance.rb | mount_uploader :favicon, FaviconUploader | Yes |
| app/models/project.rb | mount_uploader :bfg_object_map, AttachmentUploader | |
| app/models/import_export_upload.rb | mount_uploader :import_file, ImportExportUploader | Yes |
| app/models/import_export_upload.rb | mount_uploader :export_file, ImportExportUploader | Yes |
| app/models/ci/deleted_object.rb | mount_uploader :file, DeletedObjectUploader | |
| app/models/design_management/action.rb | mount_uploader :image_v432x230, DesignManagement::DesignV432x230Uploader | Yes |
| app/models/concerns/packages/debian/distribution.rb | mount_uploader :signed_file, Packages::Debian::DistributionReleaseFileUploader | Yes |
| app/models/bulk_imports/export_upload.rb | mount_uploader :export_file, ExportUploader | Yes |
| ee/app/models/user_permission_export_upload.rb | mount_uploader :file, AttachmentUploader | |
| app/models/ci/secure_file.rb | include FileStoreMounter | |