---
title: Do you really know how to ORDER BY?
teaser: The perils of a seemingly harmless ORDER BY clause.
tags: rails,sql,postgresql
author: Thiago Araújo Silva
published_on: 2025-06-02
---

My goal with this post is to get you to ask a simple question every
time you write an SQL `ORDER BY`:

> **Is my sort order deterministic?**

## Why does that matter?

Because **non-deterministic sort orders can lead to flaky tests,
incorrect results, and subtle production bugs** — all of which are
avoidable.

## An anti-pattern example

Suppose you have the following test:

```rb
test "retrieves the user status from the last created user status" do
  create(
    :user,
    statuses: [
      build(:user_status, name: "pending"),
      build(:user_status, name: "onboarding"),
      build(:user_status, name: "active")
    ]
  )

  users = User.with_latest_status

  assert_equal "active", users.first.status_name
end
```

And the corresponding implementation:

```rb
class User < ApplicationRecord
  has_many :statuses, class_name: "UserStatus"

  def self.with_latest_status
    latest_status = UserStatus
      .where("user_id = users.id")
      .order(created_at: :desc)
      .limit(1)

    select("users.*, latest_status.name as status_name")
      .joins("LEFT JOIN LATERAL (#{latest_status.to_sql}) latest_status ON true")
  end
end
```

This `LATERAL JOIN` efficiently fetches the latest status for each user
by sorting statuses by `created_at DESC` and selecting the first row.

The test passes. You ship it. Two days later, your CI breaks with an
intermittent failure:

```
Failure:
UserTest#test_retrieves_the_user_status_from_the_last_created_status [test/models/user_test.rb:23]:
Expected: "active"
  Actual: "onboarding"
```

## What happened?

Occasionally, statuses (`onboarding` and `active` in our example) are
created with the same timestamp. Since `created_at` alone isn't
guaranteed to be unique, the database may return either row.

## A naive fix

You patch the test to avoid timestamp collisions by explicitly setting
`created_at`, and you also randomize record creation for good measure:

```rb
create(
  :user,
  statuses: [
    build(:user_status, name: "pending", created_at: 3.days.ago),
    build(:user_status, name: "active", created_at: 1.day.ago),
    build(:user_status, name: "onboarding", created_at: 2.days.ago)
  ]
)
```

Now the test passes consistently. Problem solved? Not quite.

## A new problem surfaces

Later, someone else adds a request test like this:

```rb
test "onboards the user" do
  VCR.use_cassette "onboards_the_user" do
    perform_enqueued_jobs do
      travel_to "2025-05-05" do
        # Imagine these steps here:
        #
        # - Sign up the user
        # - Onboard the user
        # - etc.

        get admin_user_path(user.id)

        assert_content "Status: Onboarded"
      end
    end
  end
end
```

This test relies on `travel_to` to freeze time for deterministic
behavior in background jobs and web requests.

It passes locally. But CI starts failing intermittently:

```
Failure:
UserOnboardingTest#test_onboards_the_user [test/requests/user_onboarding_test.rb:5]:
Expected: "Status: Onboarded"
  Actual: "Status: Onboarding"
```

Same issue. But this time, you can't easily control the `created_at`
timestamps because most of the status changes happen within a single
request.

## The real fix

The root issue is that the **sort order was never truly
deterministic**. `created_at` is not unique, so relying on it alone
introduces ambiguity.

The fix? Make your `ORDER BY` resolve ties explicitly:

```rb
def self.with_latest_status
  latest_status = UserStatus
    .where("user_id = users.id")
    .order(created_at: :desc, id: :desc)
    .limit(1)

  select("users.*, latest_status.name as status_name")
    .joins("LEFT JOIN LATERAL (#{latest_status.to_sql}) latest_status ON true")
end
```

By adding `id DESC` as a secondary sort field, we guarantee
deterministic behavior — even when multiple statuses share the same
`created_at`.

## Takeaways

- Always use a unique field in your `ORDER BY` if you want
  deterministic results. You _should_ always want deterministic
  results.

- If your primary sort field isn't unique (e.g., `created_at`), add a
  tie-breaker like `id` in a way that makes sense.

## Why not just use `id DESC`?

You might wonder: “Why not simplify and just sort by `id DESC`?”

Because it doesn’t align with the business rule.

Our rule is:

> The user's current status is the last created status.

"Last created" refers to creation time, not ID. Backfills and data
corrections might insert older statuses after newer ones, which can
result in older statuses having higher IDs. During a backfill, it
might be impossible to assign custom values to the ID field, which is
usually sequential and auto-generated. As a result, sorting by ID
could produce incorrect outcomes.

Sorting by `created_at DESC, id DESC` preserves intent while
maintaining correctness — even during manual fixups or data
migrations. And I'm sure breaking the business rule could create
additional problems.

## Other problems with non-deterministic sorts

Flaky tests are just the start. A common production issue is
pagination drift: If your sort order isn't stable, the same record
might appear on multiple pages, or vanish unexpectedly between
fetches.

It's also important to note that if your query does not include an
`ORDER BY` clause, you should always add one, unless you intend to
rely on the database's implicit and internal ordering. This implicit
ordering can depend on factors such as disk write order, which may
lead to unpredictable results. Without a defined sort order, you will
face the same uncertainties and potential errors.

## Wrapup

From now on, whenever you write an `ORDER BY`, ask yourself:

> Is my sort order deterministic?

If not, fix it — before it breaks something else. Possibly for someone
else.
