Validation, Database Constraint, or Both?

Derek Prior

The validations provided by Rails are extensive. They cover presence, uniqueness, format, numericality (sic), and more. For any given constraint on your data it’s quite often possible to construct a one-line validates statement composed of the provided validators that protects your application from “garbage” data.

When validations aren’t enough

As we’ve covered previously, some of these application layer validations are insufficient for ensuring database integrity and must be backed with database constraints for this purpose. Each uniqueness validation, for example, must be backed by a unique database index to protect against race conditions. Developers must also consider whether presence validations should be backed by null: false constraints in the schema, and whether inclusion validations should be backed by check constraints.
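To see why a uniqueness validation alone cannot guarantee uniqueness, consider this dependency-free Ruby sketch. It is not Rails code: FakeTable, username_taken?, and UniqueViolation are hypothetical stand-ins for a table, the lookup a uniqueness validation performs before saving, and the error a unique index raises. Two requests both run the check before either one inserts:

```ruby
# Hypothetical stand-in for the error a unique index would raise.
class UniqueViolation < StandardError; end

# Hypothetical in-memory stand-in for a users table.
class FakeTable
  attr_reader :rows

  def initialize(unique_index:)
    @unique_index = unique_index
    @rows = []
  end

  # Mimics the lookup a uniqueness validation performs before saving.
  def username_taken?(username)
    @rows.include?(username)
  end

  # Mimics INSERT; only a unique index rejects a duplicate at write time.
  def insert(username)
    raise UniqueViolation, username if @unique_index && @rows.include?(username)

    @rows << username
  end
end

# Both requests validate first and insert second: the interleaving behind the race.
def register_twice(table, username)
  first_ok  = !table.username_taken?(username)
  second_ok = !table.username_taken?(username)
  table.insert(username) if first_ok
  table.insert(username) if second_ok
  :both_saved
rescue UniqueViolation
  :second_rejected
end

WITHOUT_INDEX = FakeTable.new(unique_index: false)
WITH_INDEX    = FakeTable.new(unique_index: true)

RESULT_WITHOUT_INDEX = register_twice(WITHOUT_INDEX, "alice")
RESULT_WITH_INDEX    = register_twice(WITH_INDEX, "alice")
```

Without the index, both checks pass and the table ends up with two "alice" rows. With it, the second insert is rejected at write time, which is the guarantee the validation cannot provide on its own.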

When all is said and done, it’s not uncommon to duplicate the majority of validations in schema and models. For example:

create_table "users", force: :cascade do |t|
  t.string :name, null: false
  t.string :username, null: false
  t.string :encrypted_password, null: false
  t.belongs_to :organization,
    null: false,
    index: true,
    foreign_key: { on_delete: :cascade }
  t.timestamps

  # Unique index backing the uniqueness validation on username
  t.index :username, unique: true
end

class User < ApplicationRecord
  attr_accessor :password

  belongs_to :organization

  validates :name, presence: true
  validates :username, presence: true, uniqueness: true
  validates :password, presence: true
  validates :encrypted_password, presence: true
end

The application built on this schema with this User model allows users to register, reset their password, and change their name. The user is associated with an organization at registration time via a route parameter.

What happened to DRY?

At first glance there’s a good deal of duplication here. However, it’s important to recognize that our schema constraints serve a different purpose than our model validations.

Schema constraints ensure the data persisted to our database, regardless of the source, is consistent and makes sense in the context of our domain. The schema for this application tells us that for users to be valid in our domain they must have a name, username, encrypted password, and a related organization. The schema further enforces consistency by ensuring that username is unique and that the related organization exists. The cascading foreign key ensures that deleting an organization row will also delete any related user rows, preventing invalid foreign key references.

Model validations, by contrast, provide a user interface around errors. If an application user leaves the name field blank, the presence validator will add a helpful error message that the user can address when the form is re-rendered.

Deciding between a validation and a constraint

It is not necessary to back each validation with a schema constraint, nor is it necessary for schema constraints to be reflected as model validations. There are a couple of questions worth asking as you decide which is appropriate for your use case.

  1. Are you trying to prevent bad data from being written to the database? If so, you must have a schema constraint. Unfortunately, Omakase Rails doesn’t natively support the creation and schema dumping of all common constraints supported by Postgres, so you must also weigh this in your decision making.
  2. Are you preventing errors that your application user can fix for themselves? If so, you should use a model validation.

Revisiting our users example

Let’s ask ourselves those questions about our users table and User model from the example above.

We’ve done well in our schema definition. Each of the constraints present exists to prevent bad data from being recorded and there are no obvious constraints missing. We even made sure to add a unique index on the username column.

Reviewing the validations on User, however, shows areas for improvement. Our application allows users to specify their name, username, and password. No other field on User is exposed to user input, yet we have validations on each field.

Let’s consider the presence validation on encrypted_password. If somehow a registration were attempted that did not set encrypted_password, then our application has a bug. The validation on this field will cause our controller to re-render the registration form. At best, our form renders all error messages and will display an error telling the user, “Encrypted password can’t be blank”. At worst, the form will display no error at all because the encrypted_password field is not present in the form. In either case, there’s nothing the user can do to address the problem themselves, and the latter case is downright puzzling to users.

Once we remove the validation from the model, that same scenario will result in an application error due to the null: false constraint on that column in the database schema. The user still can’t register, but the application error will trigger a report to our error monitoring service which will alert us to our bug.
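The difference between the two failure modes can be sketched in plain Ruby. Nothing here is Rails API: FakeUser, NotNullViolation, ERROR_REPORTS, and register are hypothetical stand-ins for the model, the database error, the error monitoring service, and the controller action.

```ruby
# Hypothetical stand-in for the error raised by a null: false column.
class NotNullViolation < StandardError; end

# Hypothetical stand-in for an error monitoring service.
ERROR_REPORTS = []

# Hypothetical model; validate_encrypted_password toggles the model validation.
class FakeUser
  def initialize(encrypted_password:, validate_encrypted_password:)
    @encrypted_password = encrypted_password
    @validate = validate_encrypted_password
  end

  # Returns false on a validation failure; raises when the "database" rejects the row.
  def save
    return false if @validate && @encrypted_password.nil?
    raise NotNullViolation, "encrypted_password" if @encrypted_password.nil?

    true
  end
end

# Hypothetical controller action: re-render on false, report on an exception.
def register(user)
  user.save ? :redirected : :form_re_rendered
rescue NotNullViolation => error
  ERROR_REPORTS << error
  :server_error
end

WITH_VALIDATION = register(
  FakeUser.new(encrypted_password: nil, validate_encrypted_password: true)
)
WITHOUT_VALIDATION = register(
  FakeUser.new(encrypted_password: nil, validate_encrypted_password: false)
)
```

With the validation in place, the bug surfaces only as a re-rendered form and nothing is reported. Without it, the same bug raises, and the single entry in ERROR_REPORTS is what would alert us.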

Perhaps surprisingly, we may also have this same problem with organization. Beginning with Rails 5, belongs_to associations are marked as required by default in newly generated applications. organization is set automatically by our controller using data from the route and never selected by the user. If a bug exists in this process, the belongs_to association will mask it by adding the validation error “Organization must exist”.

With these issues in mind, we can rewrite our User model like so:

class User < ApplicationRecord
  attr_accessor :password

  belongs_to :organization, optional: true

  validates :name, presence: true
  validates :username, presence: true, uniqueness: true
  validates :password, presence: true
end

We’ve removed the validation on encrypted_password because it does not prevent an error that our user can fix. Similarly, we declared organization as an optional association. If the latter change feels too much like a lie to you, you can change this behavior application-wide by editing the generated active_record_belongs_to_required_by_default.rb:

Rails.application.config.active_record.belongs_to_required_by_default = false

Impact on application design

As you move data integrity constraints to the database, it becomes easier to lessen the burden on your ActiveRecord models through the use of specialized form objects, because the database constraints now provide the safety net.

Form objects provide the user interface for creating one or many models. Once we view validations as a user interface for form errors, it’s logical for form objects to carry the necessary validations as well, even at the cost of potential duplication.

Form objects exist in a context and can add validations and messaging appropriate for that exact context. The context eliminates the smell of conditional validations often seen in ActiveRecord models. As our example application grows, for instance, we may end up with a RegistrationForm, ProfileForm, and PasswordResetForm, each with their own contextual validations and a User class completely devoid of validations itself.
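As a rough illustration, a registration form object might look like the sketch below. The attribute list and the hand-rolled errors hash are assumptions made to keep the example runnable without dependencies. In a Rails application this class would more likely include ActiveModel::Model and declare its checks with validates.

```ruby
# Hypothetical form object for the registration context only.
class RegistrationForm
  attr_reader :name, :username, :password, :errors

  def initialize(name:, username:, password:)
    @name = name
    @username = username
    @password = password
    @errors = {}
  end

  # Checks only what the person filling in this form can correct.
  def valid?
    @errors = {}
    { name: name, username: username, password: password }.each do |field, value|
      @errors[field] = "can't be blank" if value.to_s.strip.empty?
    end
    @errors.empty?
  end
end

INCOMPLETE = RegistrationForm.new(name: "Ada", username: "", password: "secret")
COMPLETE   = RegistrationForm.new(name: "Ada", username: "ada", password: "secret")

INCOMPLETE_VALID = INCOMPLETE.valid?
COMPLETE_VALID   = COMPLETE.valid?
```

The form knows nothing about encrypted_password or organization, since neither is something the user supplies. A username uniqueness check would also belong here, because the user can simply pick another name. It is omitted only because it needs a database.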