The find_or_create_by feature in Rails’
ActiveRecord is pretty handy:
sometimes you don’t know whether you have the record you want, but you’re sure
you want to create one if you don’t have it yet. We’ve used this in places
where the system has a concept of a remote resource with its own unique numeric
key, and you want to use one line of code instead of 3 or 4 to express the
find-or-create practice.
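To make the "one line instead of 3 or 4" point concrete, here is a minimal sketch. It uses a plain-Ruby stand-in rather than a real ActiveRecord model so that it runs on its own; RemoteResource and remote_id are hypothetical names, and the find_or_create_by defined here only mimics the shape of the ActiveRecord method.

```ruby
# Plain-Ruby stand-in for an ActiveRecord model. RemoteResource and
# remote_id are hypothetical names used only for illustration.
class RemoteResource
  attr_reader :remote_id

  @records = []

  class << self
    attr_reader :records

    def find_by(remote_id:)
      records.find { |record| record.remote_id == remote_id }
    end

    def create(remote_id:)
      new(remote_id).tap { |record| records << record }
    end

    # Mimics the shape of ActiveRecord's find_or_create_by.
    def find_or_create_by(remote_id:)
      find_by(remote_id: remote_id) || create(remote_id: remote_id)
    end
  end

  def initialize(remote_id)
    @remote_id = remote_id
  end
end

# The long form: find, check, and create by hand.
resource = RemoteResource.find_by(remote_id: 42)
if resource.nil?
  resource = RemoteResource.create(remote_id: 42)
end

# The one-line form returns the existing record instead of a duplicate.
same_resource = RemoteResource.find_or_create_by(remote_id: 42)
```

Both forms end up with the same single record; the one-liner just says so more directly.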
But what about things that you have a little more control over? When is it right to use that method vs. using a callback in your model to ensure you always have what you need?
Assume you have a
Group model, which has one
Forum model, and that
somewhere on your group view page you’ll link to that group’s forum. Two
strategies to consider…
- Strategy A - be prepared, and create right away. When you create a new
Group, immediately create a
Forum for it. Possibly do this with a callback in the
Group model during group creation. Otherwise, do something with transactions to make sure the
Group and Forum creation are atomic.
- Strategy B - just in time, create when needed. Use a
find_or_create_by during the action which requests the
Forum belonging to the relevant
Group, creating it only as-requested. The
Forum will exist by the time the view is rendered, if it didn’t at the time the page was requested.
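The two strategies can be sketched side by side. This is a rough illustration with in-memory stand-ins, not real ActiveRecord models: create_with_forum plays the role a creation callback would play in the Group model, and forum_for is a hypothetical helper standing in for the controller action. All method names here are assumptions.

```ruby
# In-memory stand-ins for the Group and Forum models. These are plain Ruby
# objects, not ActiveRecord, and every method name here is an assumption.
class Forum
  attr_reader :group_id

  @records = []

  class << self
    attr_reader :records

    def create(group_id:)
      new(group_id).tap { |forum| records << forum }
    end

    # Mimics the shape of ActiveRecord's find_or_create_by.
    def find_or_create_by(group_id:)
      records.find { |forum| forum.group_id == group_id } || create(group_id: group_id)
    end
  end

  def initialize(group_id)
    @group_id = group_id
  end
end

class Group
  attr_reader :id

  @next_id = 0

  # Plain creation, with no forum yet (the starting point for Strategy B).
  def self.create
    @next_id += 1
    new(@next_id)
  end

  # Strategy A: creating a group immediately creates its forum, the way a
  # creation callback would in a real model.
  def self.create_with_forum
    group = create
    Forum.create(group_id: group.id)
    group
  end

  def initialize(id)
    @id = id
  end
end

# Strategy B: a controller-style helper that creates the forum on demand.
def forum_for(group)
  Forum.find_or_create_by(group_id: group.id)
end

eager_group = Group.create_with_forum # forum exists right away
lazy_group = Group.create             # no forum until someone asks
forum_for(lazy_group)                 # first request creates it
forum_for(lazy_group)                 # later requests reuse it
```

In a real application the Strategy A step would also need the atomicity mentioned above, which this in-memory sketch does not attempt to model.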
Issues to consider…
Strategy A keeps the creation of your application objects (those which must exist for data normalization purposes but which a user doesn’t necessarily know about) tucked nicely away in the models, where it belongs. It’s right next to a validation, so you can be sure you won’t screw anything up, because your tests will catch you if you do. Strategy B could do most of the same, but the creation process is being triggered a bit higher up (in the controller action, by virtue of the page request).
Strategy A doesn’t care about extra data in your database. It creates things you might need someday, at the earliest point you could need them. Strategy B doesn’t want you to have a bunch of records no one cares about. It creates them when someone wants to use them, and only when someone wants to use them.
Strategy A thinks that it feels right being in the model, and avoids the
functionality-by-side-effect feeling which sometimes comes with just in time creation.
Strategy B uses an HTTP GET to create something - which is usually bad and evil.
In this case, you could argue that it’s OK because it’s not a first class
object, it’s just something your
Group needs in order to properly present a
forum about itself. You could also argue that that’s really silly, and that
since you’re using GET, you’ll have a bunch of search engines which crawl your
site and generate all your forums for you, and really, if search engines are
going to be creating things in your database, why not just create them the first
time around anyway?
What if you screw up somewhere? Strategy B is really forgiving! Even if you
have bad data, or your server crashed in the middle of some non-atomic code, or
there are solar flares outside or something - you’ll always check if you have a
Forum, and you’ll always create one if you don’t. It’s like wearing a bike helmet.
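A small sketch of that forgiving behavior, using a plain Hash as a hypothetical stand-in for the forums table. The data and lambda names are made up for illustration; the point is only the contrast between trusting that the forum exists and checking every time.

```ruby
# Hypothetical in-memory forum table keyed by group id. Group 2 has lost its
# forum, e.g. because a non-atomic creation step crashed halfway through.
forums = { 1 => "forum for group 1" }

# A lookup that trusts the forum was created up front.
strict_lookup = ->(group_id) { forums.fetch(group_id) }

# A Strategy B style lookup: check every time, fill the gap when needed.
forgiving_lookup = ->(group_id) { forums[group_id] ||= "forum for group #{group_id}" }

strict_failed = false
begin
  strict_lookup.call(2)
rescue KeyError
  # The missing forum surfaces as an error on the page request.
  strict_failed = true
end

# The forgiving lookup quietly repairs the bad data instead.
repaired = forgiving_lookup.call(2)
```

After the forgiving lookup runs once, the strict lookup would succeed too, since the missing entry has been filled in.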
In conclusion, there’s no conclusion. I think that just in time creation done
with find_or_create_by vs a “be prepared” creation done with a model
callback are both successful solutions to the “how do I build my non-first-class
objects?” question - and choosing the approach to take is probably a matter of
taste or non-generalizable project requirements.
Thanks to the thoughtbot dev team for assistance in summarizing this topic for this post.