---
title: ints or strings
teaser:
tags: web,rails
author: Jared Carroll
published_on: 2007-07-10
---

Occasionally when developing an app you have to break out an enumeration.  They
usually take the form of some name ending in 'type', e.g. group type, event type,
user type, etc.  As far as the rules of normalization go, I'm of the opinion
that these enumerations should not be classes until someone wants to <abbr
title="Create Read Update Delete">CRUD</abbr> them.  You also end up with
brittle code, such as the following, when you prematurely make them classes:

```ruby
class ArticleType < ActiveRecord::Base

  class << self

    def news
      find :first,
        :conditions => 'name = "News"'
    end

    def essay
      find :first,
        :conditions => 'name = "Essay"'
    end

    def review
      find :first,
        :conditions => 'name = "Review"'
    end

  end

end

class Article < ActiveRecord::Base

  belongs_to :article_type

end
```

Schema:

    article_types (id, name)

    articles (id, title, article_type_id)

By defining class methods, we're basically making it convenient to say things like:

```ruby
ArticleType.news
ArticleType.essay
ArticleType.review
```

but having queries based on the value of a `varchar` column seems brittle to me.
Code like this is a smell that you have some unnecessary classes, i.e. classes
with no behavior.

Let's refactor them into an attribute and corresponding constants on the
`Article` class:

```ruby
class Article < ActiveRecord::Base

  NEWS = 'News'
  ESSAY = 'Essay'
  REVIEW = 'Review'

end
```

Our schema changes to move the previous `article_types` `name` column into `articles`:

    articles (id, title, article_type)

And we reference them in code like:

```ruby
Article::NEWS
Article::ESSAY
Article::REVIEW
```

On articles#new we'll show a form to create a new `Article`, with a drop down
of the available article types for setting the `Article`'s type, so we'll need
to add another constant to `Article` for convenience:

```ruby
class Article < ActiveRecord::Base

  NEWS = 'News'
  ESSAY = 'Essay'
  REVIEW = 'Review'

  TYPES = NEWS, ESSAY, REVIEW

end
```

Now we can reference it in a view easily like:

```erb
<% form_for :article, :url => articles_path do |form| %>
  <!-- other article attributes -->
  <%= form.select :article_type, Article::TYPES, :include_blank => true %>
  <!-- other article attributes -->
<% end %>
```

OK, that's nice.

But let's take a look at those constants defined in `Article`.  For example:

```ruby
class Article < ActiveRecord::Base

  ESSAY = 'Essay'

end
```

There's a constant named `ESSAY` whose value is 'Essay'.  That doesn't feel good.

Now back in the day, in C/C++ I'd write enumerations like:

    enum { NEWS, ESSAY, REVIEW };

This was basically defining constants and assigning them integers starting
at 0.  So the value of `NEWS` was 0, `ESSAY` was 1, `REVIEW` was 2.  The bottom
line is that they were numbers, not strings.
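
Ruby has no `enum` keyword, but the same auto-numbering can be sketched with
`each_with_index`.  The module name `ArticleTypeCodes` below is made up purely
for illustration:

```ruby
# Hypothetical module mimicking the C enum: each name gets the next
# integer, starting at 0, in the order the names are listed.
module ArticleTypeCodes
  %w[NEWS ESSAY REVIEW].each_with_index do |name, value|
    const_set(name, value)
  end
end

ArticleTypeCodes::NEWS   # => 0
ArticleTypeCodes::ESSAY  # => 1
ArticleTypeCodes::REVIEW # => 2
```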

Let's rewrite the `Article` class making the constants numbers instead of strings:

```ruby
class Article < ActiveRecord::Base

  NEWS = 0
  ESSAY = 1
  REVIEW = 2

  # or as a 1-liner
  # NEWS, ESSAY, REVIEW = 0, 1, 2

  TYPES = NEWS, ESSAY, REVIEW

end
```

OK.

Now that's not as strange and redundant as the constants having basically the
same value as their name, i.e. the value of `NEWS` being 'News'.  However, we
now need something to convert these numbers to strings for display in the drop
down list in the form for creating an `Article`.

I'll put it in _app/helpers/articles_helper.rb_:

```ruby
module ArticlesHelper

  def article_types
    [['News', Article::NEWS],
     ['Essay', Article::ESSAY],
     ['Review', Article::REVIEW]]
  end

end
```
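
If the display text is also needed outside the form, say when showing a saved
article, one option is to keep the code-to-text mapping in a single hash and
derive both directions from it.  This is only a sketch: the plain `Article`
class stands in for the ActiveRecord model so the example runs on its own, and
`ARTICLE_TYPE_NAMES` and `article_type_name` are names I've made up here.

```ruby
# Plain-Ruby stand-in for the ActiveRecord model, so the sketch runs alone.
class Article
  NEWS, ESSAY, REVIEW = 0, 1, 2
end

module ArticlesHelper
  # Hypothetical constant: the one place that maps a code to its text.
  ARTICLE_TYPE_NAMES = {
    Article::NEWS   => 'News',
    Article::ESSAY  => 'Essay',
    Article::REVIEW => 'Review'
  }.freeze

  # The same [text, value] pairs as above, derived from the hash.
  def article_types
    ARTICLE_TYPE_NAMES.map { |code, name| [name, code] }
  end

  # Hypothetical helper for displaying a stored code as text.
  def article_type_name(code)
    ARTICLE_TYPE_NAMES.fetch(code, 'Unknown')
  end
end

helper = Object.new.extend(ArticlesHelper)
helper.article_type_name(Article::ESSAY) # => "Essay"
```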

And our view stays almost the same.  We just pass the helper's result instead
of `Article::TYPES`, because `#select`'s 2nd argument also accepts an array of
2-element arrays (`[text_to_display, value_to_POST]`):

```erb
<% form_for :article, :url => articles_path do |form| %>
  <!-- other article attributes -->
  <%= form.select :article_type, article_types, :include_blank => true %>
  <!-- other article attributes -->
<% end %>
```
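
To see why the pairs matter, here is a rough sketch of how a select's choices
turn into option tags.  `option_tags` is a hypothetical, much simplified
stand-in I wrote for this post, not Rails' actual implementation, and it skips
HTML escaping because the labels here are fixed:

```ruby
# Hypothetical, simplified stand-in for how a select builds its options.
# An array element is treated as [text, value]; anything else is used as
# both the text and the value.
def option_tags(choices)
  choices.map do |choice|
    text, value = choice.is_a?(Array) ? choice : [choice, choice]
    %(<option value="#{value}">#{text}</option>)
  end
end

puts option_tags([0, 1, 2]).first      # prints <option value="0">0</option>
puts option_tags([['News', 0]]).first  # prints <option value="0">News</option>
```

A flat array of numbers would show the user 0, 1 and 2, while the pairs show
the text and still POST the number.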

So by turning the enumeration into a 'classic' enumeration using numbers we
needed to add 1 method, `ArticlesHelper#article_types`.  It belongs in a helper
because it's related to view logic.  Now I did this because it felt redundant to
have enumeration constants whose value is the same as their name, e.g. `ESSAY`'s
value is 'Essay'.

Now is there another reason behind the tendency to use numbers instead of
strings as enumeration values besides this strange feeling of redundancy?
Looking at performance, I bet in <abbr title="Structured Query
Language">SQL</abbr> it's quicker to do:

```sql
select *
from articles
where article_type = 1
```

than

```sql
select *
from articles
where article_type = 'News'
```

In other words, a number comparison is faster than a string comparison in a
'where' clause.  I've noticed the tendency in older apps to represent
enumerations as numbers, usually called codes, and thought maybe it was a
performance optimization.  There'd typically be a lookup table for the text
instead of a method in the application like our `ArticlesHelper#article_types`,
probably because the database was being used by more than 1 app.  So there'd be
a schema like:

    articles (id, article_type_code)

    article_types (article_type_code, name)

That `article_types` table would be the lookup table.

I really doubt you'd feel any performance gains from turning a string-based
enumeration into a number-based enumeration though.

To me it's starting to feel more natural writing 'classic' enumerations, whose
values are numbers, and having a view method to handle the conversion when
displaying the enumeration values to the end user.  The redundancy of the
constant whose name is the same as its value is really starting to get to me.
