The find_or_create_by feature in Rails’ ActiveRecord is pretty handy: sometimes you don’t know whether you have the record you want, but you’re sure you want to create one if you don’t have it yet. We’ve used this in places where the system has a concept of a remote resource with its own unique numeric key, and you want to use one line of code instead of 3 or 4 to express the find-or-create practice.
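To make the "one line instead of 3 or 4" point concrete, here is a minimal sketch in plain Ruby. It is not ActiveRecord: RemoteResourceStore, Record and find_then_create are hypothetical names, and the hash stands in for a database table keyed by the remote resource's numeric key.

```ruby
# In-memory stand-in for a table of remote resources (hypothetical, not ActiveRecord).
class RemoteResourceStore
  Record = Struct.new(:remote_id)

  def initialize
    @records = {}
  end

  # The multi-line version: look the record up, create it if it is missing.
  def find_then_create(remote_id)
    record = @records[remote_id]
    if record.nil?
      record = Record.new(remote_id)
      @records[remote_id] = record
    end
    record
  end

  # The one-line version with the same behavior.
  def find_or_create_by(remote_id)
    @records[remote_id] ||= Record.new(remote_id)
  end

  def count
    @records.size
  end
end

store = RemoteResourceStore.new
first = store.find_or_create_by(42)   # nothing stored yet, so this creates
second = store.find_or_create_by(42)  # already stored, so this only finds
```

Both calls hand back the same record, and the store never ends up with a duplicate for key 42.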
But what about things that you have a little more control over? When is it right to use that method vs. using a callback in your model to ensure you always have what you need?
The strategies
Assume you have a Group model, which has_one Forum model - and that somewhere on your group view page you’ll link to that group’s forum. Two strategies to consider…
- Strategy A - be prepared, and create right away. When you create a new Group, immediately create a Forum for it. Possibly do this with a callback in the Group model during group creation. Otherwise, do something with transactions to make sure the Group and Forum creation are atomic.
- Strategy B - just in time, create when needed. Use a find_or_create_by during the action which requests the Forum belonging to the relevant Group, creating it only as requested. The Forum will exist by the time the view is rendered, even if it didn’t at the time the page was requested.
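The difference between the two strategies can be sketched in plain Ruby. This is only an illustration under stated assumptions: EagerGroup, LazyGroup and find_or_create_forum are hypothetical names, and the Forum struct stands in for the real model. In an actual Rails app, Strategy A would more likely be a creation callback on Group, and Strategy B a find_or_create_by call in the controller action.

```ruby
# Stand-in for the Forum model (hypothetical, not ActiveRecord).
Forum = Struct.new(:group)

# Strategy A: the forum is created at the same moment as the group.
class EagerGroup
  attr_reader :name, :forum

  def initialize(name)
    @name = name
    @forum = Forum.new(self) # plays the role of a creation callback
  end
end

# Strategy B: the forum is created the first time someone asks for it.
class LazyGroup
  attr_reader :name

  def initialize(name)
    @name = name
    @forum = nil
  end

  def forum_created?
    !@forum.nil?
  end

  # Plays the role of find_or_create_by in the controller action.
  def find_or_create_forum
    @forum ||= Forum.new(self)
  end
end

eager = EagerGroup.new("Cyclists") # forum exists immediately
lazy = LazyGroup.new("Climbers")   # no forum until it is requested
```

With EagerGroup the forum is there whether or not anyone ever visits it; with LazyGroup nothing exists until find_or_create_forum is first called, and later calls return the same forum.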
Issues to consider…
The role of the model
Strategy A keeps the creation of your application objects (those which must exist for data normalization purposes but which a user doesn’t necessarily know about) tucked nicely away in the models, where it belongs. It’s right next to a validation, so you can be sure you won’t screw anything up, because your tests will catch you if you do. Strategy B could do most of the same, but the creation process is being triggered a bit higher up (in the controller action, by virtue of the page request).
Unused records
Strategy A doesn’t care about extra data in your database. It creates things you might need sometime, at the earliest point you could possibly need them. Strategy B doesn’t want you to have a bunch of records no one cares about. It creates them when someone wants to use them, and only when someone wants to use them.
Functionality by side effect
Strategy A thinks that it feels right being in the model, and avoids the functionality by side effect feeling which sometimes comes with find_or_create_by.
Mind your verbs
Strategy B uses an HTTP GET to create something - which is usually bad and evil.
In this case, you could argue that it’s OK because it’s not a first-class
object, it’s just something your Group needs in order to properly present a
forum about itself. You could also argue that that’s really silly, and that
since you’re using GET, you’ll have a bunch of search engines which crawl your
site and generate all your forums for you, and really, if search engines are
going to be creating things in your database, why not just create them the first
time around anyway?
Room for second chances
What if you screw up somewhere? Strategy B is really forgiving! Even if you
have bad data, or your server crashed in the middle of some non-atomic code, or
there are solar flares outside or something - you’ll always check if you have a
Forum, and you’ll always create one if you don’t. It’s like wearing a bike
helmet.
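Here is a rough sketch of that forgiveness, again in plain Ruby rather than ActiveRecord. The hash, the two lambdas and the forum strings are all hypothetical; group 2 is deliberately missing its forum, as if its creation had been interrupted part way through.

```ruby
# Hypothetical in-memory "forums table", keyed by group id.
# Group 2 has no forum, simulating bad data or an interrupted creation.
forums_by_group_id = { 1 => "Forum for group 1" }

# Strategy A style read: assumes the forum was created alongside the group.
strict_lookup = ->(group_id) { forums_by_group_id.fetch(group_id) }

# Strategy B style read: creates the missing forum on the spot.
forgiving_lookup = lambda do |group_id|
  forums_by_group_id[group_id] ||= "Forum for group #{group_id}"
end

# The strict read fails for the damaged group (Hash#fetch raises KeyError).
missing_detected =
  begin
    strict_lookup.call(2)
    false
  rescue KeyError
    true
  end

# The forgiving read quietly repairs the gap.
repaired = forgiving_lookup.call(2)
```

The strict read blows up on the damaged group, while the forgiving read fills the hole and carries on, after which even the strict read succeeds.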
Wrapping up
In conclusion, there’s no conclusion. I think that just-in-time creation done with a find_or_create_by and “be prepared” creation done with a model callback are both successful solutions to the “how do I build my non-first-class objects?” question - and choosing the approach to take is probably a matter of taste or non-generalizable project requirements.
Thanks to the thoughtbot dev team for assistance in summarizing this topic for this post.