When to Create?

Matt Jankowski

The find_or_create_by feature in Rails’ ActiveRecord is pretty handy - sometimes you don’t know whether you have the record you want, but you’re sure you want to create one if you don’t have it yet. We’ve used this in places where the system has a concept of a remote resource with its own unique numeric key, and you want to use one line of code instead of 3 or 4 to express the find-or-create pattern.

But what about things that you have a little more control over? When is it right to use that method vs. using a callback in your model to ensure you always have what you need?

The strategies

Assume you have a Group model, which has_one Forum model - and that somewhere on your group view page you’ll link to that group’s forum. Two strategies to consider…

  • Strategy A - be prepared, and create right away. When you create a new Group, immediately create a Forum for it. Possibly do this with a callback in the Group model during group creation. Otherwise, wrap both creations in a transaction to make sure the Group and Forum creation is atomic.
  • Strategy B - just in time, create when needed. Use a find_or_create_by during the action which requests the Forum belonging to the relevant Group, creating it only when requested. The Forum will exist by the time the view is rendered, even if it didn’t at the time the page was requested.
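
To make the two strategies a bit more concrete, here is a rough sketch in plain Ruby. It deliberately uses no Rails and no database: ForumStore, create_with_forum and the forum method are hypothetical stand-ins invented for illustration. In a real Rails app, Strategy A would more likely be an after_create callback on Group, and Strategy B a find_or_create_by call on Forum.

```ruby
# Plain-Ruby stand-ins, no Rails or database. All names here are hypothetical.
Forum = Struct.new(:group_id)

# A tiny in-memory "forums table", keyed by group id.
class ForumStore
  def initialize
    @rows = {}
  end

  def find_by_group_id(group_id)
    @rows[group_id]
  end

  def create(group_id)
    @rows[group_id] = Forum.new(group_id)
  end

  # Return the existing forum for the group, or create one if none exists.
  def find_or_create_by_group_id(group_id)
    find_by_group_id(group_id) || create(group_id)
  end

  def count
    @rows.size
  end
end

class Group
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Strategy A: creating a group also creates its forum, in one step.
  def self.create_with_forum(id, store)
    group = new(id)
    store.create(id)
    group
  end

  # Strategy B: the forum is looked up on request and created only if missing.
  def forum(store)
    store.find_or_create_by_group_id(id)
  end
end

store = ForumStore.new

Group.create_with_forum(1, store) # Strategy A: forum exists immediately
lazy_group = Group.new(2)         # Strategy B: no forum yet
puts store.count                  # prints 1, only group 1 has a forum so far
lazy_group.forum(store)           # first request creates the forum
puts store.count                  # prints 2
```

The only real difference between the two paths is when the create happens: at group creation for Strategy A, at first request for Strategy B.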

Issues to consider…

The role of the model

Strategy A keeps the creation of your application objects (those which must exist for data normalization purposes but which a user doesn’t necessarily know about) tucked nicely away in the models, where it belongs. It’s right next to a validation, so you can be sure you won’t screw anything up, because your tests will catch you if you do. Strategy B could do most of the same, but the creation process is triggered a bit higher up (in the controller action, by virtue of the page request).

Unused records

Strategy A doesn’t care about extra data in your database. It creates things you might need sometime, at the earliest point you could possibly need them. Strategy B doesn’t want you to have a bunch of records no one cares about. It creates them when someone wants to use them, and only when someone wants to use them.

Functionality by side effect

Strategy A thinks that it feels right being in the model, and avoids the functionality-by-side-effect feeling which sometimes comes with find_or_create_by, where something that looks like a read quietly performs a write.

Mind your verbs

Strategy B uses an HTTP GET to create something - which is usually bad and evil. In this case, you could argue that it’s OK because it’s not a first-class object, it’s just something your Group needs in order to properly present a forum about itself. You could also argue that that’s really silly, and that since you’re using GET, you’ll have a bunch of search engines which crawl your site and generate all your forums for you, and really, if search engines are going to be creating things in your database, why not just create them the first time around anyway?

Room for second chances

What if you screw up somewhere? Strategy B is really forgiving! Even if you have bad data, or your server crashed in the middle of some non-atomic code, or there are solar flares outside or something - you’ll always check if you have a Forum, and you’ll always create one if you don’t. It’s like wearing a bike helmet.
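
The bike helmet property is easy to see in a small sketch. This is plain Ruby with a Hash standing in for the forums table. FORUMS_BY_GROUP and forum_for are hypothetical names made up for the example, not anything from Rails.

```ruby
# Hypothetical in-memory store: group id => forum record (a plain Hash here).
FORUMS_BY_GROUP = {}

# Find the group's forum, creating it only when it is missing.
def forum_for(group_id)
  FORUMS_BY_GROUP[group_id] ||= { group_id: group_id }
end

first  = forum_for(7) # nothing existed (say, an earlier create was lost), so one is made
second = forum_for(7) # already there, so the same record comes back

puts first.equal?(second) # prints true, no duplicate was created
```

However the forum went missing, the next request puts it back, and asking again never makes a second one.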

Wrapping up

In conclusion, there’s no conclusion. I think that just in time creation done with a find_or_create_by vs a “be prepared” creation done with a model callback are both successful solutions to the “how do I build my non-first-class objects?” question - and choosing the approach to take is probably a matter of taste or non-generalizable project requirements.

Thanks to the thoughtbot dev team for assistance in summarizing this topic for this post.