Database views are powerful SQL constructs that are unfortunately often overlooked by Rails application developers because views lack out-of-the-box support from Rails itself. Many of the more complex ActiveRecord scopes, relationships, and query objects we’ve come across could more clearly and composably be expressed as database views, and yet they seldom are.
We’ve used database views to generate complex reports, implement full-text search across multiple models, and promote concepts hidden in our data to proper domain objects in our Rails applications.
Don’t take our word for the importance of SQL views. Here’s what the PostgreSQL documentation itself has to say on the matter:
Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate the details of the structure of your tables, which might change as your application evolves, behind consistent interfaces.
Scenic adds first class support for database views to ActiveRecord migrations.
Scenic goes beyond basic create_view
and drop_view
support to offer a view
versioning system that makes maintaining views as they change as simple and
robust as managing tables via migrations. It also ships with extensive support
for materialized views.
Scenic has been open source for quite some time now, but we kept it quiet while we dog-fooded internally and gathered feedback on its use from our early users. We released version 1.0.0 in late November and we’ve just recently shipped version 1.1.0.
Scenic by example
I recently worked with the Upcase team to improve site performance. One look at the data from Skylight showed several N+1 queries that were resistant to the typical prescription of eager loading. One such resistant N+1 query was in an action that showed users flashcards that they had flagged as needing further review.
The code for the relationships at play looked like this:
class Flashcard < ActiveRecord::Base
has_many :attempts, -> { order(created_at: :desc) }
def most_recent_attempt_for(user)
attempts.where(user: user).first || NullAttempt.new
end
end
class Attempt
belongs_to :user
belongs_to :flashcard
end
As you may have noticed, flashcards have many attempts by the same user.
Attempts have a confidence
attribute and the intent was to track how a user’s
confidence in a topic moves over time. The most_recent_attempt_for
method
hints that there’s a concept that is hiding in our data, as we’re about to see.
The list of flashcards that a user did not have high confidence in was retrieved using this query object:
class FlashcardsNeedingReviewQuery
CONFIDENCE_THRESHOLD = 4
def initialize(user)
@user = user
end
def run
Flashcard.
all.
map { |q| q.most_recent_attempt_for(@user) }.
select { |a| a.confidence > 0 && a.confidence < CONFIDENCE_THRESHOLD }.
map(&:flashcard)
end
end
For each flashcard in the system, we’re getting the user’s most recent attempt, filtering to those that do not meet our confidence threshold, and finally mapping back to the flashcard object itself. Not only is this an N+1 query, but we’re also generating thousands of short-lived ruby objects which increase garbage collection pressure, rearing its head as a performance problem in other unrelated actions.
The data we collect gives us a history of a user’s confidence level, but here our application only cares about the confidence level of their most recent attempt. Let’s promote the idea of a most recent attempt on a flashcard to a proper domain concept.
Enter Scenic
With Scenic added to our Gemfile and installed, we have access to some handy generators that are going to help us create a view to encapsulate this domain object.
$ rails generate scenic:model latest_attempt
This creates a few things for us:
- A migration that runs
create_view :latest_attempts
. - A SQL file for our view definition at
db/views/latest_attempts_v01.sql
- A model that is backed by our view at
app/models/latest_attempt.rb
Let’s start by defining our SQL view in latest_attempts_v01.sql
. We want this
view to return only the latest attempt for each user for each flashcard. Free of
ActiveRecord, we can confidently use all of the features of SQL to do this. In
this case, we’ll use DISTINCT ON
, which keeps only the first row
of each set of rows where the provided columns are equal.
SELECT DISTINCT ON (user_id, flashcard_id) *
FROM attempts
ORDER BY user_id, flashcard_id, updated_at DESC
With our view definition file saved, run rake db:migrate
to create the view.
Let’s skip the model code for a minute and update the query object to have the
code we wish existed using this new LatestAttempt
model. How about something
like:
def run
LatestAttempt.by(@user).
confidence_below(CONFIDENCE_THRESHOLD).
includes(:flashcard).
map(&:flashcard)
end
That looks like a decently composable API for LatestAttempt
, so let’s get to
implementing it in our model. Our model is backed by a database view, but we’re
still able to use our familiar ActiveRecord syntax for querying and
relationships.
class LatestAttempt < ActiveRecord::Base
belongs_to :user
belongs_to :flashcard
def self.by(user)
where(user_id: user.id)
end
def self.confidence_below(level)
where("confidence < ?", level)
end
end
With all of that in place, our N+1 query is gone and we’ve significantly decreased garbage collection pressure. A benchmark of both methods with production data shows that the previously existing Ruby implementation is 85 times slower than our new implementation that makes use of the view. Equally important, we’ve promoted the previously hidden concept that flashcards have most recent attempts by each user to a proper domain object that can be reused elsewhere in our system.
Next steps
Please check out Scenic today. We’re confident you can find a use for database views in your application that will improve the performance and readability of your code.
If you’d like to hear more inside baseball about the development of Scenic, check out episode 45 of The Bike Shed, where Derek and Sean discuss the challenges encountered in developing and versioning Scenic.