Efficient IN operator queries

This document describes a technique for building efficient ordered database queries with the IN SQL operator and the usage of a GitLab utility module to help apply the technique.

note
The described technique makes heavy use of keyset pagination. It’s advised to get familiar with the topic first.

Motivation

In GitLab, many domain objects like Issue live under nested hierarchies of projects and groups. To fetch nested database records for domain objects at the group-level, we often perform queries with the IN SQL operator. We are usually interested in ordering the records by some attributes and limiting the number of records using ORDER BY and LIMIT clauses for performance. Pagination may be used to fetch subsequent records.

Example tasks requiring querying nested domain objects from the group level:

  • Show first 20 issues by creation date or due date from the group gitlab-org.
  • Show first 20 merge requests by merged at date from the group gitlab-com.

Unfortunately, ordered group-level queries typically perform badly as their executions require heavy I/O, memory, and computations. Let’s do an in-depth examination of executing one such query.

Performance problems with IN queries

Consider the task of fetching the twenty oldest created issues from the group gitlab-org with the following query:

SELECT "issues".*
FROM "issues"
WHERE "issues"."project_id" IN
    (SELECT "projects"."id"
     FROM "projects"
     WHERE "projects"."namespace_id" IN
         (SELECT traversal_ids[array_length(traversal_ids, 1)] AS id
          FROM "namespaces"
          WHERE (traversal_ids @> ('{9970}'))))
ORDER BY "issues"."created_at" ASC,
         "issues"."id" ASC
LIMIT 20
note
For pagination, ordering by the created_at column is not enough, we must add the id column as a tie-breaker.

The execution of the query can be largely broken down into three steps:

  1. The database accesses both namespaces and projects tables to find all projects from all groups in the group hierarchy.
  2. The database retrieves issues records for each project causing heavy disk I/O. Ideally, an appropriate index configuration should optimize this process.
  3. The database sorts the issues rows in memory by created_at and returns LIMIT 20 rows to the end-user. For large groups, this final step requires both large memory and CPU resources.

Execution plan for this DB query:

 Limit  (cost=90170.07..90170.12 rows=20 width=1329) (actual time=967.597..967.607 rows=20 loops=1)
   Buffers: shared hit=239127 read=3060
   I/O Timings: read=336.879
   ->  Sort  (cost=90170.07..90224.02 rows=21578 width=1329) (actual time=967.596..967.603 rows=20 loops=1)
         Sort Key: issues.created_at, issues.id
         Sort Method: top-N heapsort  Memory: 74kB
         Buffers: shared hit=239127 read=3060
         I/O Timings: read=336.879
         ->  Nested Loop  (cost=1305.66..89595.89 rows=21578 width=1329) (actual time=4.709..797.659 rows=241534 loops=1)
               Buffers: shared hit=239121 read=3060
               I/O Timings: read=336.879
               ->  HashAggregate  (cost=1305.10..1360.22 rows=5512 width=4) (actual time=4.657..5.370 rows=1528 loops=1)
                     Group Key: projects.id
                     Buffers: shared hit=2597
                     ->  Nested Loop  (cost=576.76..1291.32 rows=5512 width=4) (actual time=2.427..4.244 rows=1528 loops=1)
                           Buffers: shared hit=2597
                           ->  HashAggregate  (cost=576.32..579.06 rows=274 width=25) (actual time=2.406..2.447 rows=265 loops=1)
                                 Group Key: namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)]
                                 Buffers: shared hit=334
                                 ->  Bitmap Heap Scan on namespaces  (cost=141.62..575.63 rows=274 width=25) (actual time=1.933..2.330 rows=265 loops=1)
                                       Recheck Cond: (traversal_ids @> '{9970}'::integer[])
                                       Heap Blocks: exact=243
                                       Buffers: shared hit=334
                                       ->  Bitmap Index Scan on index_namespaces_on_traversal_ids  (cost=0.00..141.55 rows=274 width=0) (actual time=1.897..1.898 rows=265 loops=1)
                                             Index Cond: (traversal_ids @> '{9970}'::integer[])
                                             Buffers: shared hit=91
                           ->  Index Only Scan using index_projects_on_namespace_id_and_id on projects  (cost=0.44..2.40 rows=20 width=8) (actual time=0.004..0.006 rows=6 loops=265)
                                 Index Cond: (namespace_id = (namespaces.traversal_ids)[array_length(namespaces.traversal_ids, 1)])
                                 Heap Fetches: 51
                                 Buffers: shared hit=2263
               ->  Index Scan using index_issues_on_project_id_and_iid on issues  (cost=0.57..10.57 rows=544 width=1329) (actual time=0.114..0.484 rows=158 loops=1528)
                     Index Cond: (project_id = projects.id)
                     Buffers: shared hit=236524 read=3060
                     I/O Timings: read=336.879
 Planning Time: 7.750 ms
 Execution Time: 967.973 ms
(36 rows)

The performance of the query depends on the number of rows in the database. On average, we can say the following: