Less Painful Heroku Deploys with Migrations

Joe Ferris

A Rails app running on Heroku must be restarted after migrating the database.

Why

When a Rails app boots, the ActiveRecord classes reflect on the database to determine which attributes to add to the models. Whether that information is refreshed afterwards depends on the config.cache_classes setting, which is true in production mode and false in development mode.

During development, we can write and run a migration and see the change take effect without restarting the web server. In production, we need to restart the server for the ActiveRecord classes to learn about the new information from the database. Otherwise, the database will have the new column, but the ActiveRecord classes will still be working from the old information they cached at boot.
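To make the caching behavior concrete, here is a toy simulation in plain Ruby. It does not use ActiveRecord at all: FakeTable and FakeModel are hypothetical names invented for this sketch, and the memoization only imitates how a long-running production process holds on to what it learned at boot.

```ruby
# A toy stand-in for a database table: just a list of column names.
class FakeTable
  attr_reader :columns

  def initialize(*columns)
    @columns = columns
  end

  # Simulates a migration adding a column.
  def add_column(name)
    @columns << name
  end
end

# A toy stand-in for a model class that reflects on its table once
# and memoizes the result for the life of the process.
class FakeModel
  def initialize(table)
    @table = table
  end

  # The first call reads the table; later calls return the cached copy.
  def column_names
    @column_names ||= @table.columns.dup
  end

  # Simulates what a restart achieves: forget the cached columns.
  def restart!
    @column_names = nil
  end
end

table = FakeTable.new("id", "email")
model = FakeModel.new(table)

model.column_names        # "boot": caches id and email
table.add_column("name")  # the "migration" runs

puts model.column_names.inspect  # still the old columns
model.restart!
puts model.column_names.inspect  # now includes the new column
```

Until restart! is called, the model keeps answering from its cached copy even though the table has changed, which is the situation a production process is in after a migration.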

You may occasionally run into an issue that I call “stuck dynos”, where not every process seems to be aware of new columns. Restarting your Heroku app will fix this problem.

Introspecting

You can introspect your running Heroku application to see if this problem has occurred.

Here’s a real example:

$ production ps
=== web: `bundle exec rails server thin start -p $PORT -e $RACK_ENV`
web.1: up 2013/01/25 16:33:07 (~ 18h ago)
web.2: up 2013/01/25 16:47:15 (~ 18h ago)

=== worker: `bundle exec rake jobs:work`
worker.1: up 2013/01/25 17:30:58 (~ 17h ago)

The ups tell me the processes are running. They will say crashed if there’s a problem.
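If you would rather not eyeball that output, a small script can pick out anything that is not up. This is only a sketch: it assumes the line format shown in the example above (process name, colon, state), which may differ in other versions of the Heroku CLI, and unhealthy_processes is a hypothetical helper name.

```ruby
# Sample output copied from the example above, with web.2 changed to
# "crashed" purely for illustration.
SAMPLE_PS_OUTPUT = <<~'OUTPUT'
  === web: `bundle exec rails server thin start -p $PORT -e $RACK_ENV`
  web.1: up 2013/01/25 16:33:07 (~ 18h ago)
  web.2: crashed 2013/01/25 16:47:15 (~ 18h ago)

  === worker: `bundle exec rake jobs:work`
  worker.1: up 2013/01/25 17:30:58 (~ 17h ago)
OUTPUT

# Returns the names of processes whose state is anything other than "up".
# Header lines ("=== web: ...") and blank lines do not match and are skipped.
def unhealthy_processes(ps_output)
  ps_output.each_line.each_with_object([]) do |line, names|
    match = line.strip.match(/\A(\w+\.\d+):\s+(\w+)/)
    next unless match

    name, state = match.captures
    names << name unless state == "up"
  end
end

puts unhealthy_processes(SAMPLE_PS_OUTPUT).inspect
```

In practice the input would come from capturing the real command's output instead of the sample string.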

When I run production tail (from Parity), I am usually just watching the stream for anything unusual, like 500 errors. Error reporting services are sometimes delayed, so there is no faster way to know about a post-deploy issue than tailing the logs while running through some critical workflows in the app.

If something looks unusual, I might then move over to the logging service we have set up (typically Splunk Storm or Papertrail) to run some searches to see how often the problem is coming up or if it looks new post-deploy.

New Relic or Airbrake will likely have more backtrace information by this time, and we can decide whether to roll back the deploy, work on a hot fix as the next action, or record the bug and place it lower in the backlog.

Avoiding the whole issue

This is such a common issue that Parity includes a command to migrate the database, and then immediately restart the web processes after the migration finishes:

production migrate

This obviates the “stuck dyno” problem and minimizes the time between migration and restart.
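If you are not using Parity, the same migrate-then-restart sequence can be sketched in a few lines of Ruby that shell out to the Heroku CLI. This is not Parity's actual implementation. It assumes a git remote named production and the heroku run and heroku restart commands, and the method names are hypothetical.

```ruby
# Builds the two commands, in order: migrate first, then restart.
# Each command is an argument array so that no shell quoting is needed.
def migrate_and_restart_commands(remote)
  [
    ["heroku", "run", "--remote", remote, "rake", "db:migrate"],
    ["heroku", "restart", "--remote", remote],
  ]
end

# Runs the commands in order and stops at the first failure, so the
# app is only restarted if the migration succeeded.
def migrate_and_restart(remote)
  migrate_and_restart_commands(remote).all? { |command| system(*command) }
end

# Usage (requires the Heroku CLI and a configured remote):
#   migrate_and_restart("production")
```

Stopping at the first failure is a deliberate choice: if the migration fails, the running processes still match the old schema, and a restart would not help.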
