---
title: 3 ways to model an associated collection in Rails
teaser: A look at the Postgres array type and good old join tables, each with their
  advantages and disadvantages.
tags: rails,databases,sql,postgresql
author: Dimiter Petrov
published_on: 2023-11-23
---

Imagine we're modeling an amusement park. It has attractions and guests.

We start with this database migration:

```ruby
class InitialSchema < ActiveRecord::Migration[7.1]
  def change
    create_table :attractions do |t|
      t.string :name, null: false
      t.timestamps
    end

    create_table :guests do |t|
      t.timestamps
    end
  end
end
```

Guests have to wait in line before going on an attraction.

What are some fundamental properties of this? I can think of these:

1. A guest can only physically queue at a single attraction.
2. A guest can only be in the queue once.

As a last constraint, let's say that our amusement park has only one queue per attraction.

Let's see different ways we can represent the queue in a Rails application.

## An array column

PosgreSQL has an [array type][pg-array-type] and it is [supported by ActiveRecord][active-record-pg-array].

[pg-array-type]: https://www.postgresql.org/docs/current/arrays.html
[active-record-pg-array]: https://guides.rubyonrails.org/active_record_postgresql.html#array

```ruby
class CreateQueuedGuests < ActiveRecord::Migration[7.1]
  def change
    add_column :attractions, :queued_guests, :bigint, array: true, null: false, default: []
  end
end
```

There are two notable instructions. One is that the column type is an array of `bigint`s. That is because we want to store the IDs of guest records.

The other notable option is the default value of `[]`, i.e. empty array. Without specifying a default value, the, uh, default is `null`. You cannot add elements to `null`/`nil`:

```ruby
nil << 2
# => undefined method `<<' for nil:NilClass (NoMethodError)
```

But you can add elements to an empty array:

```ruby
[] << 2
# => [2]
```

In our case adding guests to the end of the queue is a common operation and we don't want to think about initializing the array before adding the first item.

```ruby
class AttractionTest < ActiveSupport::TestCase
  test "adding guests to the queue" do
    guest1 = Guest.create!
    guest2 = Guest.create!
    attraction = Attraction.create!(name: "Water ride")

    attraction.queued_guests << guest1.id
    attraction.queued_guests << guest2.id

    assert_equal [guest1.id, guest2.id], attraction.queued_guests
  end
end
```

What's more, our rule of "an attraction has only one queue" is better served by an _empty_ queue (`[]`) than an non-existent queue (`null`/`nil`).

Sometimes guests change their mind after waiting some time and they want to go to another attraction. How do we remove them from the queue? We use the `delete` method like we would for any array. A quick test to illustrate:

```ruby
test "removing a guest from the queue" do
  guest = Guest.create!
  attraction = Attraction.create!(name: "Ferris wheel")
  attraction.queued_guests << guest.id
  assert_equal [guest.id], attraction.queued_guests

  attraction.queued_guests.delete(guest.id)

  assert_equal [], attraction.reload.queued_guests
end
```

We demonstrated adding and removing elements from the array column, but we've ignored the rules we'd agreed on at the start.
Remember that we said we're modeling standing in a single queue. We could check this in a test like the following:

```ruby
test "a guest can only queue at one attraction" do
  guest1 = Guest.create!
  guest2 = Guest.create!
  attraction1 = Attraction.create!(name: "Water ride")
  attraction2 = Attraction.create!(name: "Ferris wheel")

  attraction1.queue_guest(guest1)
  attraction1.queue_guest(guest1)
  attraction1.queue_guest(guest2)
  attraction2.queue_guest(guest2)

  assert_equal [guest1.id, guest2.id], attraction1.queued_guests
  assert_equal [], attraction2.queued_guests
end
```

As far as I know, PostgreSQL can't help us with uniqueness constraints when we're using an array column. We need to check by ourselves if the guest has queue already before adding them to a queue:

```ruby
class Attraction < ApplicationRecord
  def queue_guest(guest)
    Attraction.transaction do
      existing_queued_guests = Attraction.lock.all.pluck(:queued_guests).flatten
      if existing_queued_guests.none? { |id| id == guest.id }
        queued_guests << guest.id
        save
      end
    end
  end
end
```

Notice my attempt at [locking][locks] all records before updating. That is because if the same code is running in two processes simultaneously, one of them might insert the record right after the one has checked that the record does not exist. I don't even think the code is correct and I am not sure how to do it properly.

[locks]: https://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html

It also seems like we are doing a lot of manual work to do something at which the database is better.

## A table for queues and a join table

This time we represent a queue as a row in a `guest_queues` table instead of a column on in the `attractions` table.

```ruby
create_table :guest_queues do |t|
  t.references :attraction, null: false, foreign_key: true, index: true
  t.timestamps
end
```

<aside class"info">
We name the table <code>guest_queues</code> and the model <code>GuestQueue</code>, because <code>Queue</code> is already a class in Ruby's <code>thread</code> library.
</aside>

We also introduce a join table between the queue and guests. I'll call it `queue_membership`. It represents the fact of belonging to a queue. Adding a person is a matter of creating a `QueueMembership`. Removing a person of the queue is done by deleting the `QueueMembership`.

```ruby
create_table :queue_memberships do |t|
  t.references :guest_queue, null: false, foreign_key: true, index: true
  t.references :guest, null: false, foreign_key: true, index: {unique: true}
  t.timestamps
  t.index :created_at
  t.index [:guest_queue_id, :guest_id], unique: true
end
```

The uniqueness constraints enforce our rules from the start: a guest can only be in a single queue and they appear at most once in a queue.

The queue order was previously provided by the array. Now we can use the order of arrival. This is why added a database index to the `created_at` column in the migration.

```ruby
class GuestQueue < ApplicationRecord
  has_many :queue_memberships, -> { order(:created_at) }
  has_many :guests, through: :queue_memberships
end
```

Now that we've named the guest queue and queue membership concepts, here's how we could add and remove guests:

```ruby
class QueueMembershipsController < ApplicationController
  def create
    guest_queue = GuestQueue.find_or_create_by!(
      attraction_id: params.fetch(:attraction_id)
    )
    guest_queue.queue_memberships.find_or_create_by!(
      guest_id: params.fetch(:guest_id)
    )
  end

  def destroy
    QueueMembership.find_by(id: params[:id])&.destroy
  end
end
```

An annoying detail becomes apparent in this code: a queue record must exist before we add a guest to it. Here we've chosen to create it on demand. However, one could also accidentally create more than one queue, say through a bug in that `GuestQueue.find_or_create_by` call.

How can we make it so we only have a single queue and we don't need to create it manually?

## Just a join table

That `guest_queues` table in the previous example is not doing much work for us. It points to an attraction. Guests are queueing in the queue of the attraction. We can skip the indirection and say they are queueing at the attraction. In code that looks like the following.

```ruby
create_table :queue_memberships do |t|
  t.references :attraction, null: false, foreign_key: true, index: true
  t.references :guest, null: false, foreign_key: true, index: {unique: true}
  t.timestamps
  t.index :created_at
  t.index [:attraction_id, :guest_id], unique: true
end
```

The attraction can retrieve the `queued_guests` directly, much like in the first example where that was an array column.

```ruby
class Attraction < ApplicationRecord
  has_many :queue_memberships, -> { order(:created_at) }
  has_many :queued_guests, through: :queue_memberships, source: :guest
end

class QueueMembership < ApplicationRecord
  belongs_to :attraction
  belongs_to :guest
end
```

Now, the controller is also shorter:

```ruby
class QueueMembershipsController < ApplicationController
  def create
    QueueMembership.create!(
      attraction_id: params.fetch(:attraction_id),
      guest_id: params.fetch(:guest_id)
    )
  end

  def destroy
    QueueMembership.find_by(id: params[:id])&.destroy
  end
end
```

## Do we need a table for the collection?

We've seen that it's possible to represent the queue without a `queues` table, so why would we add one?

### When `has_one` becomes `has_many`

The amusement park management might decide to introduce a more expensive tickets that gives you access to an exclusive second queue with higher priority. Now we need to model multiple queues per attraction and perhaps give a name or type to each queue.

### When you want to store other properties of the collection

Above, the `GuestQueue` model didn't hold any information except the associated attraction. Say we want to model the maximum number of guests that can comfortably fit in the queue. A new `capacity` column on the `guest_queues` table would do the job well, because each queue has a different capacity.
