Foreign keys and associations contribute

When adding an association to a model you must also add a foreign key. When adding a foreign key you must always add an index first.

If the index must be created async due to duration reasons, you must avoid adding the foreign key until the index gets created. To ensure data consistency, you must avoid using the new association in rails until the foreign key and index both get created.

For example, say you have the following model:

 Ruby Copy to clipboard  
class User < ActiveRecord::Base
  has_many :posts
end

Add a foreign key here on column posts.user_id. This ensures that data consistency is enforced on database level. Foreign keys also mean that the database can very quickly remove associated data (for example, when removing a user), instead of Rails having to do this.

Adding foreign keys in migrations

Foreign keys can be added concurrently using add_concurrent_foreign_key as defined in Gitlab::Database::MigrationHelpers. See the Migration Style Guide for more information.

Keep in mind that you can only safely add foreign keys to existing tables after you have removed any orphaned rows. The method add_concurrent_foreign_key does not take care of this so you must do so manually. See adding foreign key constraint to an existing column.

Use `bigint` for foreign keys

When adding a new foreign key, you should define it as bigint. Even if the referenced table has an integer primary key type, you must reference the new foreign key as bigint. As we are migrating all primary keys to bigint, using bigint foreign keys saves time, and requires fewer steps, when migrating the parent table to bigint primary keys.

Consider `reverse_lock_order`

Consider using reverse_lock_order for high traffic tables Both add_concurrent_foreign_key and remove_foreign_key_if_exists take a boolean option reverse_lock_order which defaults to false.

You can read more about the context for this in the the original issue.

This can be useful where we have known queries that are also acquiring locks (usually row locks) on the same tables at a high frequency.

Consider, for example, the scenario where you want to add a foreign key like:

 SQL Copy to clipboard  
ALTER TABLE ONLY todos
    ADD CONSTRAINT fk_91d1f47b13 FOREIGN KEY (note_id) REFERENCES notes(id) ON DELETE CASCADE;

And consider the following hypothetical application code:

 Ruby Copy to clipboard  
Todo.transaction do
   note = Note.create(...)
   # Observe what happens if foreign key is added here!
   todo = Todo.create!(note_id: note.id)
end

If you try to create the foreign key in between the 2 insert statements we can end up with a deadlock on both transactions in Postgres. Here is how it happens:

Note.create: acquires a row lock on notes
ALTER TABLE ... acquires a table lock on todos
ALTER TABLE ... FOREIGN KEY attempts to acquire a table lock on notes but this blocks on the other transaction which has a row lock
Todo.create attempts to acquire a row lock on todos but this blocks on the other transaction which has a table lock on todos

This illustrates how both transactions can be stuck waiting for each other to finish and they will both timeout. We normally have transaction retries in our migrations so it is usually OK but the application code might also timeout and there might be an error for that user. If this application code is running very frequently it’s possible that we will be constantly timing out the migration and users may also be regularly getting errors.

The deadlock case with removing a foreign key is similar because it also acquires locks on both tables but a more common scenario, using the example above, would be a DELETE FROM notes WHERE id = .... This query will acquire a lock on notes followed by a lock on todos and the exact same deadlock described above can happen. For this reason it’s almost always best to use reverse_lock_order for removing a foreign key.

Updating foreign keys in migrations

Sometimes a foreign key constraint must be changed, preserving the column but updating the constraint condition. For example, moving from ON DELETE CASCADE to ON DELETE SET NULL or vice-versa.

PostgreSQL does not prevent you from adding overlapping foreign keys. It honors the most recently added constraint. This allows us to replace foreign keys without ever losing foreign key protection on a column.

To replace a foreign key:

Add the new foreign key:

 Ruby Copy to clipboard  
class ReplaceFkOnPackagesPackagesProjectId < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  NEW_CONSTRAINT_NAME = 'fk_new'

  def up
    add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :nullify, name: NEW_CONSTRAINT_NAME)
  end

  def down
    with_lock_retries do
      remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :nullify, name: NEW_CONSTRAINT_NAME)
    end
  end
end

Remove the old foreign key:

 Ruby Copy to clipboard  
class RemoveFkOld < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  OLD_CONSTRAINT_NAME = 'fk_old'

  def up
    with_lock_retries do
      remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :cascade, name: OLD_CONSTRAINT_NAME)
    end
  end

  def down
    add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :cascade, name: OLD_CONSTRAINT_NAME)
  end
end

Cascading deletes

Every foreign key must define an ON DELETE clause, and in 99% of the cases this should be set to CASCADE.

Indexes

When adding a foreign key in PostgreSQL the column is not indexed automatically, thus you must also add a concurrent index. Indexes are required for all foreign keys and they must be added before the foreign key. This can mean that they are an earlier step in the same migration or they are added in an earlier migration than the migration adding the foreign key. For the same reasons, foreign keys must be removed before removing indexes supporting these foreign keys.

Without an index on the foreign key it forces Postgres to do a full table scan every time a record is deleted from the referenced table. In the past this has led to incidents where deleting projects and namespaces times out.

It is also ok to have a composite index which covers this foreign key so long as the foreign key is in the first position of the composite index. For example if you have a foreign key project_id then it is OK to have a composite index like BTREE (project_id, user_id) but it is not OK to have an index like BTREE (user_id, project_id). The latter does not allow efficient lookups by project_id alone and therefore would not prevent the cascade deletes from timing out. Partial indexes like BTREE (project_id) WHERE user_id IS NULL can never be used for cascading deletes and are not OK for serving as an index for the foreign key.

Naming foreign keys

By default Ruby on Rails uses the _id suffix for foreign keys. So we should only use this suffix for associations between two tables. If you want to reference an ID on a third party platform the _xid suffix is recommended.

The spec spec/db/schema_spec.rb tests if all columns with the _id suffix have a foreign key constraint. So if that spec fails, don’t add the column to IGNORED_FK_COLUMNS, but instead add the FK constraint, or consider naming it differently.

Dependent removals

Don’t define options such as dependent: :destroy or dependent: :delete when defining an association. Defining these options means Rails handles the removal of data, instead of letting the database handle this in the most efficient way possible.

In other words, this is bad and should be avoided at all costs:

 Ruby Copy to clipboard  
class User < ActiveRecord::Base
  has_many :posts, dependent: :destroy
end

Should you truly have a need for this it should be approved by a database specialist first.

You should also not define any before_destroy or after_destroy callbacks on your models unless absolutely required and only when approved by database specialists. For example, if each row in a table has a corresponding file on a file system it may be tempting to add a after_destroy hook. This however introduces non database logic to a model, and means we can no longer rely on foreign keys to remove the data as this would result in the file system data being left behind. In such a case you should use a service class instead that takes care of removing non database data.

In cases where the relation spans multiple databases you have even further problems using dependent: :destroy or the above hooks. You can read more about alternatives at Avoid dependent: :nullify and dependent: :destroy across databases.

Alternative primary keys with `has_one` associations

Sometimes a has_one association is used to create a one-to-one relationship:

 Ruby Copy to clipboard  
class User < ActiveRecord::Base
  has_one :user_config
end

class UserConfig < ActiveRecord::Base
  belongs_to :user
end

In these cases, there may be an opportunity to remove the unnecessary id column on the associated table, user_config.id in this example. Instead, the originating table ID can be used as the primary key for the associated table:

 Ruby Copy to clipboard  
create_table :user_configs, id: false do |t|
  t.references :users, primary_key: true, default: nil, index: false, foreign_key: { on_delete: :cascade }
  ...
end

Setting default: nil ensures a primary key sequence is not created, and because the primary key automatically gets an index, we set index: false to avoid creating a duplicate. You must also add the new primary key to the model:

 Ruby Copy to clipboard  
class UserConfig < ActiveRecord::Base
  self.primary_key = :user_id

  belongs_to :user
end

Using a foreign key as primary key saves space but can make batch counting in Service Ping less efficient. Consider using a regular id column if the table is relevant for Service Ping.

Docs

Edit this page to fix an error or add an improvement in a merge request.

Create an issue to suggest an improvement to this page.

Product

Create an issue if there's something you don't like about this feature.

Propose functionality by submitting a feature request.

Feature availability and product trials

View pricing to see all GitLab tiers and features, or to upgrade.

Try GitLab for free with access to all features for 30 days.

Get help

If you didn't find what you were looking for, search the docs.

If you want help with something specific and could use community support, post on the GitLab forum.

For problems setting up or using this feature (depending on your GitLab subscription).

Request support