Testing Techniques: The I/O Table

If you work with database systems, I bet you’ve written a SQL query to retrieve rows in some specific order at least once in your career. Moreover, you may even have applied an ORDER BY with two or more columns, so how would you go about testing that kind of query?

Let’s take a look at a common attempt in Ruby:

it "fetches appointments sorted by date_time asc, provider first name asc, and city asc" do
  freeze_time do
    location1 = create(:location, city: "Atlanta")
    location2 = create(:location, city: "Nevada")

    provider_z = create(:provider, first_name: "Zed")
    provider_a = create(:provider, first_name: "Albert")

    appointment2 = create(:appointment, date_time: 5.hours.from_now)
    appointment3a = create(:appointment, date_time: 2.days.from_now)
    appointment1 = create(:appointment, date_time: 1.hour.from_now)
    appointment3d = create(
      :appointment,
      date_time: 2.days.from_now + 1.hour,
      provider: provider_z,
      location: location2
    )
    appointment3c = create(
      :appointment,
      date_time: 2.days.from_now + 1.hour,
      provider: provider_z,
      location: location1
    )
    appointment3b = create(
      :appointment,
      date_time: 2.days.from_now + 1.hour,
      provider: provider_a,
      location: location1
    )

    expected_sort_order = [
      appointment1,
      appointment2,
      appointment3a,
      appointment3b,
      appointment3c,
      appointment3d
    ]

    expect(AppointmentsQuery.call).to eq expected_sort_order
  end
end

In this example, we are creating a few records at the top, and at the bottom we expect the appointments to come out in the order specified by the test description.

This test works, but it has some serious shortcomings – and that’s what we will discuss next.

The three traits of a good test

A good test may have further traits, but here we will focus on the three main ones:

Correctness
Readability
Maintainability

Correctness

Is that test correct? In my opinion, it is. After reading it for a while you will probably notice that the appointments are shuffled to avoid false positives. If we had created them in the same order we expected them to come out, and if our SELECT query had omitted the ORDER BY clause, our test could still have passed! That’s because in most situations they will come out in the same order they were created. However, the sort order is unspecified and may depend, among other factors, on the order in which the rows were created on disk.

Also, we are freezing the time to avoid equal dates from differing by milliseconds, since we are working with relative dates. Otherwise, we’d introduce flakiness and random failures that would compromise the reliability of our test suite.

While I might trust this test to ensure that my query is working correctly, it is very difficult to understand, which lowers my confidence. It’s also very easy to break because of its complexity. Which leads us to Readability!

Readability

The appointment variables denote an explicit sense of ordering, which may become clear after looking at the expected_sort_order array.

If we pay even more attention, we will notice that appointment3a, appointment3b, appointment3c, and appointment3d are named that way because they group four records under the same date umbrella (2.hours.from_now + 1.hour). Hence, 1, 2, and 3 designate three different groups of dates, while a, b, c, and d designate the expected order within the third group.

However, you may have noticed the existence of an implicit group of rows in the third group of dates, since our ORDER BY has three columns. Should we name our variables appointment3a0, appointment3a1, etc, to account for the third column? Maybe not! I’m sure that would get rather messy, so the tradeoff is not worth it.

Now, is that test readable? Maybe just a little. But it’s complicated, easy to misunderstand, and not visually appealing. What I mean is that the setup is lengthy and there’s a lot of noise with the variables and in the test setup, which gets in the way of understanding what is actually being tested. Thereby, readability is impaired.

Maintainability

“Maintainability” is where the code example certainly gets the lowest grade. Given the following test failure, would we actually know what is wrong with the code and how to fix the problem?

1) AppointmentsQuery fetches appointments sorted by date_time asc, provider first name asc, and city asc
   Failure/Error: expect(AppointmentsQuery.call).to eq expected_sort_order

     expected: [#<Appointment id: 81, date_time: "2022-03-25 20:16:50.000000000 +0000", provider_id: 70, location_id...intment id: 82, date_time: "2022-03-27 20:16:50.000000000 +0000", provider_id: 66, location_id: 67>]
          got: #<ActiveRecord::Relation [#<Appointment id: 84, date_time: "2022-03-27 20:16:50.000000000 +0000", pro...ntment id: 81, date_time: "2022-03-25 20:16:50.000000000 +0000", provider_id: 70, location_id: 70>]>

     (compared using ==)

     Diff:
     @@ -1,7 +1,31 @@
     -[#<Appointment id: 81, date_time: "2022-03-25 20:16:50.000000000 +0000", provider_id: 70, location_id: 70>,
     - #<Appointment id: 79, date_time: "2022-03-26 00:16:50.000000000 +0000", provider_id: 68, location_id: 68>,
     - #<Appointment id: 80, date_time: "2022-03-27 19:16:50.000000000 +0000", provider_id: 69, location_id: 69>,
     - #<Appointment id: 84, date_time: "2022-03-27 20:16:50.000000000 +0000", provider_id: 67, location_id: 66>,
     - #<Appointment id: 83, date_time: "2022-03-27 20:16:50.000000000 +0000", provider_id: 66, location_id: 66>,
     - #<Appointment id: 82, date_time: "2022-03-27 20:16:50.000000000 +0000", provider_id: 66, location_id: 67>]
     +[#<Appointment:0x00007fe0adf708b0
     +  id: 84,
     +  date_time: Sun, 27 Mar 2022 20:16:50.000000000 UTC +00:00,
     +  provider_id: 67,
     +  location_id: 66>,
     + #<Appointment:0x00007fe0adf70770
     +  id: 83,
     +  date_time: Sun, 27 Mar 2022 20:16:50.000000000 UTC +00:00,
     +  provider_id: 66,
     +  location_id: 66>,
     + #<Appointment:0x00007fe0adf706a8

Output omitted for brevity…

We would probably have a hard time debugging that failure, and it would get even worse due to the relative dates and the foreign key mystery guests. If we were testing a complex query where the sorting is just a small part of the logic, good luck figuring that out!

And what if we wanted to introduce another appointment in our setup to be the second expected record? We would have to rename all of the appointment variables downstream: appointment2 to appointment3, appointment3a to appointment4a, and so on.

Finally, any changes to the code as it stands are likely to introduce hard-to-debug issues, or worse, have the test pass even when the underlying SQL is broken

So, is this test maintainable? I believe it is not.

Back to basics: input and output (I/O)

Let’s get down to the basics. What are we actually testing here? Not the appointment objects, that’s for sure. We are actually sending data to a store beyond the boundaries of our application and retrieving it back, which means we’re dealing with side effects in discrete steps. Since we are not coupled to the format of any particular objects, that makes our testing environment even more controllable for the table formats we’re aiming to achieve down below. And most importantly: it’s just input and output.

Let’s organize our shuffled input as an input table:

date_time	first_name	city
2022-03-22 15:00	Zyler	Texas
2022-03-24 10:00	Zyler	Texas
2022-03-22 11:00	Zyler	Texas
2022-03-24 11:00	Zed	Nevada
2022-03-24 11:00	Zed	Atlanta
2022-03-24 11:00	Albert	Atlanta

Given our input table, it’s pretty easy to sort our records so let’s recall our desired sort order:

First, by date_time ASC;
Second, by first_name ASC;
Third, by city ASC.

Given that information, here’s our sorted output table:

date_time	first_name	city
2022-03-22 11:00	Zyler	Texas
2022-03-22 15:00	Zyler	Texas
2022-03-24 10:00	Zyler	Texas
2022-03-24 11:00	Albert	Atlanta
2022-03-24 11:00	Zed	Atlanta
2022-03-24 11:00	Zed	Nevada

Couldn’t our tables have been written down with pen and paper?

Along with the description of our test, that’s pretty straightforward and much easier to undestand. How might we write our test as close as possible to that?

Refactoring our test

Since our goal is for our test to read exactly like a table of inputs and outputs, we will skip the intermediate steps that led us to the end result:

def create_appointments(rows)
  rows.each do |(date_time, first_name, city)|
    create(
      :appointment,
      date_time: date_time,
      location: create(:location, city: city),
      provider: create(:provider, first_name: first_name)
    )
  end
end

def appointments_query_result
  AppointmentsQuery.call.map do |appointment|
    [
      appointment.date_time.strftime("%Y-%m-%d %H:%M"),
      appointment.provider.first_name,
      appointment.location.city
    ]
  end
end

it "fetches appointments sorted by date_time asc, provider first name asc, and city asc" do
  create_appointments(
    [
      ["2022-03-22 15:00", "Zyler",  "Texas"],
      ["2022-03-24 10:00", "Zyler",  "Texas"],
      ["2022-03-22 11:00", "Zyler",  "Texas"],
      ["2022-03-24 11:00", "Zed",    "Nevada"],
      ["2022-03-24 11:00", "Zed",    "Atlanta"],
      ["2022-03-24 11:00", "Albert", "Atlanta"]
    ]
  )

  expect(appointments_query_result).to eq(
    [
      ["2022-03-22 11:00", "Zyler",  "Texas"],
      ["2022-03-22 15:00", "Zyler",  "Texas"],
      ["2022-03-24 10:00", "Zyler",  "Texas"],
      ["2022-03-24 11:00", "Albert", "Atlanta"],
      ["2022-03-24 11:00", "Zed",    "Atlanta"],
      ["2022-03-24 11:00", "Zed",    "Nevada"]
    ]
  )
end

That’s much better looking than our first attempt! We are even aligning the matrix items so that they read like tables. I wouldn’t do that with variable assignments, but our intent here is that it reads like a table, literally.

Note that we don’t even need to specify the name of our columns because in this case, they are self-revealing; it’s very noticeable that the first column holds date times, that “Albert” is a name, and that “Atlanta” is a city. Otherwise, we could have resorted to vertically layed-out hashes or introduced a comment at the top of our table to serve as a header.

Finally, we no longer need to freeze the time because we are using absolute dates as pure data rather than relative dates.

Takeaways

Here are a few takeaways¹ of our refactoring:

There are no convoluted names or mystery guests;
The helper functions remove a lot of the noise of repeated object construction;
The tabular layouts make reading everything easier;
Expecting raw data instead of variable names has three huge benefits:
- You can clearly see why the expected sort order is what it is;
- You can quickly identify missing test cases;
- You can understand the test failure messages without looking at a wall of object arrays.

Most of the takeaways also apply to the I/O table technique in general!

I really recommend this approach when you’re dealing with tabular data and side effects, but in no way should you limit yourself to that. Generally, I would recommend not being afraid to use raw data in your tests, as data duplication is usually not a big deal because clarity is more important than DRY when it comes to tests.

When I originally paired with a fellow developer on this code, Louis, this was how he summarized the benefits of transforming his code into this I/O Table format. ↩