Occasionally when developing an app you have to break out an enumeration. These usually take the form of some name ending in ‘type’, e.g. group type, event type, user type. As far as the rules of normalization go, my position is that these enumerations should not be classes until someone wants to CRUD them. When you prematurely make them classes, you also end up with brittle code such as the following:
class ArticleType < ActiveRecord::Base
  class << self
    def news
      find :first,
           :conditions => 'name = "News"'
    end

    def essay
      find :first,
           :conditions => 'name = "Essay"'
    end

    def review
      find :first,
           :conditions => 'name = "Review"'
    end
  end
end
class Article < ActiveRecord::Base
  belongs_to :article_type
end
Schema:
article_types (id, name)
articles (id, title, article_type_id)
By defining class methods, we’re basically making it convenient to say things like:
ArticleType.news
ArticleType.essay
ArticleType.review
but having queries based on the value of a varchar
column seems brittle to me.
Code like this is a smell that you have some unnecessary classes i.e. classes
with no behavior.
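To make the brittleness concrete, here is a minimal plain-Ruby sketch. It does not touch a database: `ArticleTypeRow` and `FakeArticleTypes` are hypothetical stand-ins I made up for the `article_types` table and the `ArticleType.news` finder, so treat it as an illustration rather than what ActiveRecord does internally.

```ruby
# Hypothetical in-memory stand-in for a row of the article_types table.
ArticleTypeRow = Struct.new(:id, :name)

# Hypothetical stand-in for the ArticleType class methods above.
class FakeArticleTypes
  def initialize(rows)
    @rows = rows
  end

  # Mirrors: find :first, :conditions => 'name = "News"'
  def news
    @rows.find { |row| row.name == 'News' }
  end
end

rows  = [ArticleTypeRow.new(1, 'News'), ArticleTypeRow.new(2, 'Essay')]
types = FakeArticleTypes.new(rows)

before_rename = types.news   # the row with id 1

# Someone edits the label stored in the name column...
rows[0].name = 'Latest News'

after_rename = types.news    # nil: the lookup silently stops matching
```

Nothing raises when the label changes; the finder just starts returning nil, and the failure shows up somewhere far away from the rename.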
Let’s refactor them into an attribute and corresponding constants on the Article class:
class Article < ActiveRecord::Base
  NEWS   = 'News'
  ESSAY  = 'Essay'
  REVIEW = 'Review'
end
Our schema changes to move the previous article_types name column into articles:
articles (id, title, article_type)
And we reference them in code like:
Article::NEWS
Article::ESSAY
Article::REVIEW
On articles#new we’ll show a form to create a new Article and use a drop-down of the available article types to set the Article’s article_type, so we’ll need to add another constant to Article for convenience.
class Article < ActiveRecord::Base
  NEWS   = 'News'
  ESSAY  = 'Essay'
  REVIEW = 'Review'
  TYPES  = NEWS, ESSAY, REVIEW
end
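A side benefit of having TYPES is that it can also guard against values that are not part of the enumeration. Below is a rough plain-Ruby sketch of that idea. `TypedArticle` and `valid_type?` are names I invented for illustration, and the class deliberately does not inherit from ActiveRecord::Base so it runs on its own. In a real model you would more likely reach for something like `validates_inclusion_of :article_type, :in => TYPES`.

```ruby
# Plain-Ruby stand-in for the Article model (no ActiveRecord), for illustration.
class TypedArticle
  NEWS   = 'News'
  ESSAY  = 'Essay'
  REVIEW = 'Review'
  TYPES  = [NEWS, ESSAY, REVIEW].freeze

  attr_reader :article_type

  def initialize(article_type)
    @article_type = article_type
  end

  # Hypothetical helper: true only for values listed in TYPES.
  def valid_type?
    TYPES.include?(article_type)
  end
end

essay_ok = TypedArticle.new(TypedArticle::ESSAY).valid_type?
blog_ok  = TypedArticle.new('Blog').valid_type?
```

Here `essay_ok` ends up true and `blog_ok` false, since 'Blog' is not in TYPES.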
Now we can reference it in a view easily like:
<% form_for :article, :url => articles_path do |form| %>
  <!-- other article attributes -->
  <%= form.select :article_type, Article::TYPES, :include_blank => true %>
  <!-- other article attributes -->
<% end %>
OK, that’s nice.
But let’s take a look at those constants defined in Article. For example:
class Article < ActiveRecord::Base
  ESSAY = 'Essay'
end
There’s a constant named ESSAY whose value is 'Essay'. That doesn’t feel good.
Now back in the day, in C/C++ I’d write enumerations like:
enum { NEWS, ESSAY, REVIEW };
This was basically defining constants and assigning them integers starting at 0. So the value of NEWS was 0, ESSAY was 1, and REVIEW was 2. The bottom line is that they were numbers, not strings.
Let’s rewrite the Article class, making the constants numbers instead of strings:
class Article < ActiveRecord::Base
  NEWS   = 0
  ESSAY  = 1
  REVIEW = 2
  # or as a 1-liner
  # NEWS, ESSAY, REVIEW = 0, 1, 2
  TYPES = NEWS, ESSAY, REVIEW
end
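If you miss the way a C enum numbers its members automatically, something similar can be sketched in Ruby with each_with_index and const_set. This is just one possible approach, not something the Article class needs. `NumberedArticle` is a made-up plain-Ruby class (no ActiveRecord) so the snippet runs by itself.

```ruby
# Plain-Ruby stand-in for Article, for illustration only.
class NumberedArticle
  # Number the names from 0 upward, like a C enum would.
  # Caveat: reordering this list changes the numbers, which would
  # silently change the meaning of values already stored in a database.
  %w[NEWS ESSAY REVIEW].each_with_index do |name, index|
    const_set(name, index)
  end

  TYPES = [NEWS, ESSAY, REVIEW].freeze
end

news_value   = NumberedArticle::NEWS
review_value = NumberedArticle::REVIEW
```

For three values the explicit `NEWS = 0` style is probably clearer. The loop only starts to pay off with longer lists, and even then the ordering caveat in the comment applies.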
OK.
Now that’s not as strange and redundant as the constants having basically the same value as their name, i.e. the value of NEWS being 'News'. However, we now need something to convert these numbers to strings to display in the drop-down list in the form for creating an Article.
I’ll put it in app/helpers/articles_helper.rb:
module ArticlesHelper
  def article_types
    [['News',   Article::NEWS],
     ['Essay',  Article::ESSAY],
     ['Review', Article::REVIEW]]
  end
end
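One variation worth considering keeps the number-to-label mapping in a single hash and derives the drop-down pairs from it, so the same data can also turn a stored number back into its label. The sketch below is plain Ruby under my own assumptions: `LabeledArticle`, `TYPE_NAMES`, `ArticleTypeOptions`, and `article_type_name` are hypothetical names, not part of the app above.

```ruby
# Plain-Ruby stand-in for Article, for illustration only.
class LabeledArticle
  NEWS, ESSAY, REVIEW = 0, 1, 2

  # Single place that maps each enumeration value to its display text.
  TYPE_NAMES = { NEWS => 'News', ESSAY => 'Essay', REVIEW => 'Review' }.freeze
end

# Hypothetical helper module, standing in for ArticlesHelper.
module ArticleTypeOptions
  # [text_to_display, value_to_POST] pairs for a select drop-down.
  def article_types
    LabeledArticle::TYPE_NAMES.map { |value, name| [name, value] }
  end

  # Display text for a stored value; raises KeyError on unknown values.
  def article_type_name(value)
    LabeledArticle::TYPE_NAMES.fetch(value)
  end
end

helper  = Object.new.extend(ArticleTypeOptions)
options = helper.article_types
label   = helper.article_type_name(LabeledArticle::ESSAY)
```

Whether the hash lives on the model or in the helper is a judgment call. The labels are display text, which argues for the helper, but keeping them next to the constants makes it harder to forget one when a new type is added.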
And our view stays almost the same. #select’s 2nd argument also accepts an array of 2-element arrays ([text_to_display, value_to_POST]), so we pass the helper’s result instead of Article::TYPES:
<% form_for :article, :url => articles_path do |form| %>
  <!-- other article attributes -->
  <%= form.select :article_type, article_types, :include_blank => true %>
  <!-- other article attributes -->
<% end %>
So by turning the enumeration into a ‘classic’ enumeration using numbers, we needed to add one method, ArticlesHelper#article_types. It belongs in a helper because it’s related to view logic. I did this because it felt redundant to have enumeration constants whose value is the same as their name, e.g. ESSAY’s value being 'Essay'.
Now, is there another reason behind the tendency to use numbers instead of strings as enumeration values, besides this strange feeling of redundancy? Looking at performance, I bet in SQL it’s quicker to do:
select *
from articles
where article_type = 1
than
select *
from articles
where article_type = 'News'
In other words, a number comparison is faster than a string comparison in a ‘where’ clause. I’ve noticed the tendency in older apps to represent enumerations as numbers, usually called codes, and thought maybe it was a performance optimization. There’d typically be a lookup table for the text instead of a method in the application like our ArticlesHelper#article_types, probably because the database was being used by more than one app. So there’d be a schema like:
articles (id, article_type_code)
article_types (article_type_code, name)
That article_types table would be a lookup table.
I really doubt you’d feel any performance gains from turning a string-based enumeration into a number-based enumeration, though.
To me it’s starting to feel more natural to write ‘classic’ enumerations whose values are numbers, with a view method to handle the conversion when displaying the enumeration values to the end user. The redundancy of a constant whose name is the same as its value is really starting to get to me.