We recently came across an interesting problem, which was discussed in a previous post, Postgres Window Functions:
We want to get each post’s three most recent comments.
As discussed, you can’t use simple eager loading:
Post.order(created_at: :desc).limit(5).includes(:comments)
This will load every comment for each post. When there are many comments per post, this quickly becomes unacceptable.
Starting Slow
It’s frequently easiest to start with a slow implementation and make it faster when necessary. In many cases, the slower (and possibly simpler) implementation will work just fine, and it’s best to deploy it as-is. Let’s look at a naive implementation of our list of posts and comments:
class PostsController < ApplicationController
  def index
    @posts = Post.order(created_at: :desc).limit(5)
  end
end
class Post < ActiveRecord::Base
  has_many :comments, dependent: :destroy
  def latest_comments
    comments.order(created_at: :desc).limit(3)
  end
end
This will frequently do fine, but causes N+1 queries, which will look like this in your log:
    Started GET "/" for 127.0.0.1 at 2014-09-18 11:36:18 -0400
    Processing by PostsController#index as HTML
      Post Load (0.4ms)  SELECT "posts".* FROM "posts"
        ORDER BY "posts"."created_at" DESC LIMIT 5
      Comment Load (0.2ms)  SELECT "comments".* FROM "comments"
        WHERE "comments"."post_id" = $1
        ORDER BY "comments"."created_at" DESC
        LIMIT 3  [["post_id", 27]]
      Comment Load (0.3ms)  SELECT "comments".* FROM "comments"
        WHERE "comments"."post_id" = $1
        ORDER BY "comments"."created_at" DESC
        LIMIT 3  [["post_id", 28]]
      Comment Load (0.2ms)  SELECT "comments".* FROM "comments"
        WHERE "comments"."post_id" = $1
        ORDER BY "comments"."created_at" DESC
        LIMIT 3  [["post_id", 29]]
      ...
If you’re using New Relic like we do, you’ll know a slow transaction has an N+1 problem when it shows many queries to the same model or table in the transaction log. Once performance starts to suffer, you’ll want to consolidate those queries.
In the previous post, we described how you could use a Postgres Window Function to find the comments you want in one query:
SELECT * FROM (
  SELECT comments.*, dense_rank() OVER (
    PARTITION BY comments.post_id
    ORDER BY comments.created_at DESC
  ) AS comment_rank
) AS ranked_comments
WHERE comment_rank < 4;
However, how can we plug this query into ActiveRecord such that we can use the data in our views?
It’s actually fairly easy. You need two new objects: a Query Object and a Decorator. Let’s refactor to introduce these objects, and then we’ll plug in our query.
The Query Object
We can perform the Extract Class refactoring and create a Feed model:
class Feed
  def initialize(posts:)
    @posts = posts.order(created_at: :desc).limit(5)
  end
  def posts
    @posts
  end
end
class PostsController < ApplicationController
  def index
    @feed = Feed.new(posts: Post.all)
  end
end
The Decorator
We can use SimpleDelegator to create a quick decorator class for Post:
class PostWithLatestComments < SimpleDelegator
  def latest_comments
    comments.order(created_at: :desc).limit(3)
  end
end
We can apply this decorator to each Post in the Feed:
class Feed
  def posts
    @posts.map { |post| PostWithLatestComments.new(post) }
  end
end
The SQL Query
At this point, we’ve done nothing except to introduce two new classes to our system. However, we’ve provided ourselves an opportunity.
We frequently use Query Objects to wrap up complex SQL without polluting models. In addition to encapsulating SQL, though, they can also hold context, empowering objects to remember the query from whence they came. We’ll use this property of query objects to plug our SQL into our application.
First, we’ll use the above SQL query to find the comments relevant to our posts:
# feed.rb
def comments
  Comment.
    select("*").
    from(Arel.sql("(#{ranked_comments_query}) AS ranked_comments")).
    where("comment_rank <= 3")
end
def ranked_comments_query
  Comment.where(post_id: @posts.map(&:id)).select(<<-SQL).to_sql
    comments.*,
    dense_rank() OVER (
      PARTITION BY comments.post_id
      ORDER BY comments.created_at DESC
    ) AS comment_rank
  SQL
end
Then, we’ll group those comments by post_id into a Hash:
class Feed
  def initialize(posts:)
    @posts = posts.order(created_at: :desc).limit(5)
    @comment_cache = build_comment_cache
  end
  # ...
  private
  def build_comment_cache
    comments.group_by(&:post_id)
  end
  # ...
end
Now, we pass that Hash to our decorator:
# feed.rb
def posts
  @posts.map { |post| PostWithLatestComments.new(post, @comment_cache) }
end
class PostWithLatestComments < SimpleDelegator
  def initialize(post, comments_by_post_id)
    super(post)
    @comments_by_post_id = comments_by_post_id
  end
  def latest_comments
    @comments_by_post_id[id] || []
  end
end
The Result
Our Feed class is now smart enough to perform two SQL queries:
- One query to poststo find the posts we care about.
- One query to comments(using Postgres Window Functions) to find the latest three comments for each post.
It then decorates each post, providing the preloaded Hash of comments to the
decorator. This allows the decorated posts to find their latest three comments
without performing an additional query.
The finished Feed class looks like this:
class Feed
  def initialize(posts:)
    @posts = posts.order(created_at: :desc).limit(5)
    @comment_cache = build_comment_cache
  end
  def posts
    @posts.map { |post| PostWithLatestComments.new(post, @comment_cache) }
  end
  private
  def build_comment_cache
    comments.group_by(&:post_id)
  end
  def comments
    Comment.
      select("*").
      from(Arel.sql("(#{ranked_comments_query}) AS ranked_comments")).
      where("comment_rank <= 3")
  end
  def ranked_comments_query
    Comment.where(post_id: @posts.map(&:id)).select(<<-SQL).to_sql
      comments.*,
      dense_rank() OVER (
        PARTITION BY comments.post_id
        ORDER BY comments.created_at DESC
      ) AS comment_rank
    SQL
  end
end
As you can see, most of the logic is concerned with generating that SQL query, and the machinery for plugging the results into our ActiveRecord models is very lightweight.
At this point, requests in our log look something like this:
Started GET "/" for 127.0.0.1 at 2014-09-18 13:53:39 -0400
Processing by PostsController#index as HTML
  Post Load (0.4ms)  SELECT "posts".* FROM "posts"
    ORDER BY "posts"."created_at" DESC
    LIMIT 5
  Comment Load (0.5ms)  SELECT * FROM (
    SELECT comments.*,
      dense_rank() OVER (
        PARTITION BY comments.post_id
        ORDER BY comments.created_at DESC
      ) AS comment_rank
    FROM "comments"
    WHERE "comments"."post_id" IN (154, 153)
    ) AS ranked_comments WHERE (comment_rank <= 3)
You can use this approach for many situations where it’s difficult to use simple eager loading.
What’s next
If you’re looking for more ActiveRecord magic, be sure to check out our Advanced ActiveRecord Querying course on Upcase, or learn how the query in this post works by reading about Postgres Window Functions.
