- What’s a flaky test?
- Quarantined tests
- Automatic retries and flaky tests detection
- Problems we had in the past at GitLab
A flaky test is a test that sometimes fails but eventually passes if you retry it enough times.
When a test fails frequently, a ~"master:broken" issue should be created for it.
If the test cannot be fixed in a timely fashion, it hurts the productivity of all
developers, so it should be placed in quarantine.
A quarantined test is skipped unless it is run with:
bin/rspec --tag quarantine
Before putting a test in quarantine, you should make sure that a ~"master:broken" issue exists for it so it won’t stay in quarantine forever.
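The selection rule described above can be modeled in a few lines of plain Ruby. This is only a sketch of the behavior, not GitLab's actual RSpec configuration: run_example?, the metadata hashes, and the list of requested tags are hypothetical names introduced for illustration.

```ruby
# Hypothetical model of tag-based selection; not GitLab's real configuration.
# metadata:       tags attached to one example, e.g. { quarantine: true }
# requested_tags: tags passed on the command line, e.g. [:quarantine]
def run_example?(metadata, requested_tags)
  if requested_tags.empty?
    # Default run: quarantined examples are skipped.
    !metadata[:quarantine]
  else
    # Tagged run: only examples carrying a requested tag are executed.
    requested_tags.any? { |tag| metadata[tag] }
  end
end

stable      = {}
quarantined = { quarantine: true }

puts run_example?(stable, [])                 # default run, stable example
puts run_example?(quarantined, [])            # default run, quarantined example
puts run_example?(quarantined, [:quarantine]) # run with --tag quarantine
```

Under this model, a quarantined example never affects a default run, yet it can still be exercised on demand while someone investigates the flakiness.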
Once a test is in quarantine, there are three questions to answer:
- Should the test be fixed (i.e. get rid of its flakiness)?
- Should the test be moved to a lower level of testing?
- Should the test be removed entirely (e.g. because there’s already a lower-level test, or it’s duplicating another same-level test, or it’s testing too much etc.)?
Quarantined tests are run on the CI in dedicated jobs that are allowed to fail:
- rspec-pg-quarantine (CE & EE)
We also use a home-made RspecFlaky::Listener listener, which records flaky
examples in a JSON report file.
This was originally implemented in: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/13021.
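To make the idea of recording flaky examples in a JSON report more concrete, here is a minimal sketch in plain Ruby. It is not the real RspecFlaky::Listener and does not hook into RSpec; FlakyRecorder, its methods, the example IDs, and the report layout are all assumptions made for illustration.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical recorder; NOT the real RspecFlaky::Listener.
class FlakyRecorder
  def initialize
    # Maps an example ID to the list of outcomes of its attempts.
    @attempts = Hash.new { |hash, id| hash[id] = [] }
  end

  # Record the outcome (:passed or :failed) of one attempt of an example.
  def record(example_id, status)
    @attempts[example_id] << status
  end

  # An example counts as flaky if it both failed and passed at least once.
  def flaky_examples
    @attempts.select do |_id, statuses|
      statuses.include?(:failed) && statuses.include?(:passed)
    end
  end

  # Write the flaky examples and their attempt history to a JSON file.
  def write_report(path)
    report = flaky_examples.transform_values do |statuses|
      { 'attempts' => statuses.map(&:to_s) }
    end
    File.write(path, JSON.pretty_generate(report))
  end
end

recorder = FlakyRecorder.new
recorder.record('spec/a_spec.rb[1:1]', :failed)
recorder.record('spec/a_spec.rb[1:1]', :passed) # passed on retry: flaky
recorder.record('spec/b_spec.rb[1:1]', :passed) # passed first time: stable

report_path = File.join(Dir.mktmpdir, 'flaky_report.json')
recorder.write_report(report_path)
puts File.read(report_path)
```

Keeping the full attempt history, rather than a boolean, is a design choice of this sketch: it makes it possible to see how often an example needed a retry.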
To enable retries locally, set the RETRIES environment variable. For example, RETRIES=1 bin/rspec ... retries the failing examples once.
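The retry semantics can be sketched in plain Ruby as follows. This is an illustration under stated assumptions, not how rspec-retry is implemented: run_with_retries and its return values are hypothetical, and reading RETRIES with a default of 0 is an assumption of this sketch.

```ruby
# Hypothetical helper; rspec-retry's real implementation differs.
# Runs the block once, then up to `retries` more times if it raises.
# Returns :passed (first attempt succeeded), :flaky (succeeded after
# at least one retry) or :failed (never succeeded).
def run_with_retries(retries: Integer(ENV.fetch('RETRIES', '0')))
  attempts = 0
  begin
    attempts += 1
    yield attempts
    attempts == 1 ? :passed : :flaky
  rescue StandardError
    # Re-run the begin block while retries remain.
    retry if attempts <= retries
    :failed
  end
end

calls = 0
outcome = run_with_retries(retries: 1) do
  calls += 1
  raise 'transient failure' if calls == 1 # fails only on the first attempt
end
puts outcome
```

An example that only passes after a retry is exactly the kind worth reporting as flaky, since the retry hides the failure from the pipeline result.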
rspec-retry is biting us when some API specs fail: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/9825
Sporadic RSpec failures due to
- FFaker generates funky data that tests are not ready to handle (and tests should be predictable so that’s bad!):
- Make spec/mailers/notify_spec.rb more robust: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10015
Transient failure in
- Replace FFaker factory data with sequences: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10184
- Transient failure in spec/finders/issues_finder_spec.rb: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10404
- Be sure to create all the data the test needs before starting the exercise: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12059
- Bis: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12604
- Bis: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12664
- Assert against the underlying database state instead of against a page’s content: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10934
- In JS tests, shifting elements can cause Capybara to misclick when the element moves at the exact time Capybara sends the click
- Triggering JS events before the event handlers are set up
- Wait for the image to be lazy-loaded when asserting on a Markdown image’s src attribute
- Transient failure of spec/features/issues/filtered_search/filter_issues_spec.rb: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10411
- Don’t wait for AJAX when no AJAX request is fired: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10454
- Bis: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12626
- Memory is through the roof! (TL;DR: Load images but block images requests!): https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12003
- Test imports a project (via Sidekiq) that is growing over time, leading to timeouts when the import takes longer than 60 seconds
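One of the fixes listed above replaces randomly generated FFaker data with sequences. The sketch below shows, in plain Ruby, why sequences make test data predictable: every run produces the same unique values in the same order. Sequence and next_value are hypothetical names for illustration; this is not the API of any factory library.

```ruby
# Hypothetical minimal sequence generator; real factory libraries offer more.
class Sequence
  def initialize(&formatter)
    @counter = 0
    @formatter = formatter
  end

  # Return the next value: deterministic and unique within this sequence.
  def next_value
    @counter += 1
    @formatter.call(@counter)
  end
end

usernames = Sequence.new { |n| "user#{n}" }
emails    = Sequence.new { |n| "user#{n}@example.com" }

3.times { puts "#{usernames.next_value} <#{emails.next_value}>" }
```

Unlike random data, these values cannot collide with each other or contain characters a test was not written to handle.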