Rails’ ActiveRecord provides a comprehensive interface for querying the database. Unchecked and without proper processes in place, it can become unwieldy as the domain changes.
The Setup
Imagine an application domain where a team of people publishes a technical blog.
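class Person < ApplicationRecord
  # Post references its author through author_id, so name the key explicitly
  has_many :posts, foreign_key: :author_id
end

class Post < ApplicationRecord
  belongs_to :author, class_name: "Person"
end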
In addition to the author association and other standard post data attributes, the Post model contains a boolean flag named published.
A Rails controller showing the newest published posts might look like:
class PostsController < ApplicationController
  def index
    @newest_posts = Post.where(published: true).order(created_at: :desc).limit(10)
  end
end
Let’s go one step further, where we create a page dedicated to the list of published authors:
class AuthorsController < ApplicationController
  def index
    @published_authors = Person.distinct.joins(:posts).where(posts: { published: true })
  end
end
The Pain
A new feature comes in where teammates want to enqueue posts to be published in the future.
This could be modeled by changing the published boolean to a published_at timestamp that allows for three states:
- unpublished (published_at is set to nil)
- published (published_at is set to a timestamp less than or equal to now)
- enqueued (published_at is set to a timestamp in the future)
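To make the three states concrete, here is a minimal plain-Ruby sketch. The publication_state helper is hypothetical (it is not part of Rails or of the application above), and it takes the current time as an argument so the boundary case is easy to check.

```ruby
# Hypothetical helper, not part of Rails: classifies a published_at value
# into one of the three states listed above.
def publication_state(published_at, now = Time.now)
  return :unpublished if published_at.nil?

  # A timestamp less than or equal to now counts as published.
  published_at <= now ? :published : :enqueued
end

now = Time.now
publication_state(nil, now)       # unpublished: no timestamp at all
publication_state(now - 60, now)  # published: a minute in the past
publication_state(now + 60, now)  # enqueued: a minute in the future
```

Note that a timestamp exactly equal to now falls on the published side of the boundary, which matters when the same rule is later written as a query.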
While this is a relatively small change in the database and corresponding migration (which we won’t go into here), the necessary changes across these different controllers are a symptom of a code smell, Shotgun Surgery: one conceptual change forces edits in many separate places.
While this example is small, in larger codebases, changes like this can add up to a sizeable PR quickly. Most often, a shift in data like this touches:
- controllers
- service objects
- query objects
- jobs
- factories
- tests (especially acceptance tests or anything that touches the database)
The Underlying Issue
The underlying issue here is that ActiveRecord can act as a leaky abstraction.
It abstracts over the database while still exposing direct references to
columns, and where can be called from anywhere: directly on Post within a
controller or, even worse, by reaching through an association, as the second
controller does to find published authors. As a result, we’re littering
information about how a post is considered published (the contents of the
where clause) across a few different files within the application (currently,
the model and two separate controllers).
The Suggested Fix
While this approach is dependent on the complexity of the queries, I’d first
lean on a class method on Post:
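class Post < ApplicationRecord
  # other methods

  # Posts whose published_at is at or before the current time.
  # Rows with a nil published_at never match the comparison.
  def self.published
    where("published_at <= ?", Time.current)
  end
end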
With this, changes to the controllers are trivial:
class PostsController < ApplicationController
  def index
-   @newest_posts = Post.where(published: true).order(created_at: :desc).limit(10)
+   @newest_posts = Post.published.order(created_at: :desc).limit(10)
  end
end

class AuthorsController < ApplicationController
  def index
-   @published_authors = Person.distinct.joins(:posts).where(posts: { published: true })
+   @published_authors = Person.distinct.joins(:posts).merge(Post.published)
  end
end
Is there still coupling at the controller level between a person and their
corresponding posts? Yep! A change to that relationship, however, seems like
one that ought to be breaking, whereas the notion of a post being published
should hold, generally speaking, whether we’re using a published boolean, a
published_at timestamp, or some sort of state machine.
Worth highlighting in this second change is the merge method, which handles
all the heavy lifting of combining the conditions from the Post.published
query with the Person.distinct.joins(:posts) query.
Caveats and Considerations
In working with larger applications, use of where is not the only indicator
of this kind of leak from ActiveRecord, nor is it always problematic. where
with associations, for example, falls into the “coupling association”
category, which is usually innocuous.
It’s also worth noting that problematic use of where is not confined to
Rails controllers; service objects, jobs, and other areas of the application
querying against the “guts” of an ActiveRecord object are susceptible.
Finally, while we used a class method in the example above, for larger queries, consider a dedicated query object to encapsulate logic in the appropriate spots.
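As a rough illustration of that last point, a query object might look something like the sketch below. The PublishedPostsQuery name and its interface are assumptions made for this example, not an established convention. It is written in plain Ruby, so it uses Time.now where a Rails app would likely prefer Time.current. It accepts anything that responds to where.

```ruby
# Hypothetical query object: the class name and interface are illustrative.
class PublishedPostsQuery
  # relation: any object responding to `where` (in Rails, e.g. Post.all)
  # now:      injected so callers and tests can pin the clock
  def initialize(relation, now: Time.now)
    @relation = relation
    @now = now
  end

  # Returns whatever the relation's `where` returns, so the definition of
  # "published" lives in exactly one place.
  def call
    @relation.where("published_at <= ?", @now)
  end
end
```

In a Rails app this could be called as PublishedPostsQuery.new(Post.all).call. Because where returns a relation, the result can still be chained with order or limit.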