My goal with this post is to get you to ask a simple question every
time you write an SQL ORDER BY
:
Is my sort order deterministic?
Why does that matter?
Because non-deterministic sort orders can lead to flaky tests, incorrect results, and subtle production bugs — all of which are avoidable.
An anti-pattern example
Suppose you have the following test:
test "retrieves the user status from the last created user status" do
create(
:user,
statuses: [
build(:user_status, name: "pending"),
build(:user_status, name: "onboarding"),
build(:user_status, name: "active")
]
)
users = User.with_latest_status
assert_equal "active", users.first.status_name
end
And the corresponding implementation:
class User < ApplicationRecord
has_many :statuses, class_name: "UserStatus"
def self.with_latest_status
latest_status = UserStatus
.where("user_id = users.id")
.order(created_at: :desc)
.limit(1)
select("users.*, latest_status.name as status_name")
.joins("LEFT JOIN LATERAL (#{latest_status.to_sql}) latest_status ON true")
end
end
This LATERAL JOIN
efficiently fetches the latest status for each user
by sorting statuses by created_at DESC
and selecting the first row.
The test passes. You ship it. Two days later, your CI breaks with an intermittent failure:
Failure:
UserTest#test_retrieves_the_user_status_from_the_last_created_status [test/models/user_test.rb:23]:
Expected: "active"
Actual: "onboarding"
What happened?
Occasionally, statuses (onboarding
and active
in our example) are
created with the same timestamp. Since created_at
alone isn’t
guaranteed to be unique, the database may return either row.
A naive fix
You patch the test to avoid timestamp collisions by explicitly setting
created_at
, and you also randomize record creation for good measure:
create(
:user,
statuses: [
build(:user_status, name: "pending", created_at: 3.days.ago),
build(:user_status, name: "active", created_at: 1.day.ago),
build(:user_status, name: "onboarding", created_at: 2.days.ago)
]
)
Now the test passes consistently. Problem solved? Not quite.
A new problem surfaces
Later, someone else adds a request test like this:
test "onboards the user" do
VCR.use_cassette "onboards_the_user" do
perform_enqueued_jobs do
travel_to "2025-05-05" do
# Imagine these steps here:
#
# - Sign up the user
# - Onboard the user
# - etc.
get admin_user_path(user.id)
assert_content "Status: Onboarded"
end
end
end
end
This test relies on travel_to
to freeze time for deterministic
behavior in background jobs and web requests.
It passes locally. But CI starts failing intermittently:
Failure:
UserOnboardingTest#test_onboards_the_user [test/requests/user_onboarding_test.rb:5]:
Expected: "Status: Onboarded"
Actual: "Status: Onboarding"
Same issue. But this time, you can’t easily control the created_at
timestamps because most of the status changes happen within a single
request.
The real fix
The root issue is that the sort order was never truly
deterministic. created_at
is not unique, so relying on it alone
introduces ambiguity.
The fix? Make your ORDER BY
resolve ties explicitly:
def self.with_latest_status
latest_status = UserStatus
.where("user_id = users.id")
.order(created_at: :desc, id: :desc)
.limit(1)
select("users.*, latest_status.name as status_name")
.joins("LEFT JOIN LATERAL (#{latest_status.to_sql}) latest_status ON true")
end
By adding id DESC
as a secondary sort field, we guarantee
deterministic behavior — even when multiple statuses share the same
created_at
.
Takeaways
Always use a unique field in your
ORDER BY
if you want deterministic results. You should always want deterministic results.If your primary sort field isn’t unique (e.g.,
created_at
), add a tie-breaker likeid
in a way that makes sense.
Why not just use id DESC
?
You might wonder: “Why not simplify and just sort by id DESC
?”
Because it doesn’t align with the business rule.
Our rule is:
The user’s current status is the last created status.
“Last created” refers to creation time, not ID. Backfills and data corrections might insert older statuses after newer ones, which can result in older statuses having higher IDs. During a backfill, it might be impossible to assign custom values to the ID field, which is usually sequential and auto-generated. As a result, sorting by ID could produce incorrect outcomes.
Sorting by created_at DESC, id DESC
preserves intent while
maintaining correctness — even during manual fixups or data
migrations. And I’m sure breaking the business rule could create
additional problems.
Other problems with non-deterministic sorts
Flaky tests are just the start. A common production issue is pagination drift: If your sort order isn’t stable, the same record might appear on multiple pages, or vanish unexpectedly between fetches.
It’s also important to note that if your query does not include an
ORDER BY
clause, you should always add one, unless you intend to
rely on the database’s implicit and internal ordering. This implicit
ordering can depend on factors such as disk write order, which may
lead to unpredictable results. Without a defined sort order, you will
face the same uncertainties and potential errors.
Wrapup
From now on, whenever you write an ORDER BY
, ask yourself:
Is my sort order deterministic?
If not, fix it — before it breaks something else. Possibly for someone else.