Read between the lines

Joe Ferris

People frequently use lines of code as a metric. How many times have you heard someone say, “it’s only 10 lines?” or “This class is almost 200 lines?” Have you ever heard somebody boast that they could write something in fewer lines of code? Do you talk about your test to code ratio?

How many lines of code something takes doesn’t tell us much about the code. Let’s take a closer look at some better metrics.

Complexity

Focusing on lines of code encourages developers to jam as much complexity as they can into each line.

Compare this example:

def credit
  user && coupon = user.coupons.where(:product_id => product.id).first.try(:amount) || 0
end

To this one:

def credit
  user && user.coupon_for(product).amount || 0
end

# On User
def coupon_for(product)
  coupons.where(:product_id => product.id).first.try(:amount)
end

Although the first example is only three lines of code, it has a lot of complexity in one method. That means this method is harder to understand, reuse, modify, or extract. Flog rates this method at 13.9.

The second example example has twice as many lines, but each method is less complex. Flog rates the second example at 8.1 per method.

Even in the second example, the methods are dense. Further refactoring would result in more lines of code, but smaller, simpler methods.

Readability

The previous example also improves readability. Every method you extract is an opportunity to clarify something by naming it, while every piece of complexity you add to a line is anonymous.

Code is frequently more readable when it expands over multiple lines of code. Here’s a common Ruby idiom:

users.each(&:friend_added) if new_record?

I like to call this the “if surprise.” It’s easy to scan down a method and miss that dangling if, not realizing that the method you’re debugging actually has a logical fork.

I prefer this:

if new_record?
  users.each(&:friend_added)
end

It’s three times as many lines, but the same amount of complexity, and more readable.

Test to code ratio

Another place that people obsess over lines is their test to code ratio. Rails even comes with a task to print this stat out for you, and many presentations use this stat when bragging about their test coverage (or lack thereof). There are a few things that make this metric meaningless:

  • Some frameworks create taller code. RSpec tests tend to have more lines than the same Test::Unit test, despite having the same coverage.
  • The best way to have a high test to code ratio is to copy and paste test code, which is the testing equivalent of never flushing your toilet.
  • How many tests you have is a poor metric for measuring the quality of the test suite.

Here are some questions that can be useful when measuring your test suite:

  • Were all the tests written before the code that they motivate?
  • Do you have 100% C0 coverage? Ask rcov or simplecov.
  • Do your tests break if your code breaks?
  • Is there duplication in your tests?
  • Can you read your tests easily?
  • If you add new code, do old tests break?
  • Can you refactor one class without breaking the tests for other classes?

No code is faster than no code

Merb popularized this saying and encouraged users not to use code they didn’t need. However, it’s important not to misinterpret this by assuming that fewer lines of code means less code, and therefore faster code.

Here’s an example of a minimal Fibonacci sequence:

class Fibber
  def fib(n)
    if n < 2
      n
    else
      fib(n - 1) + fib(n - 2)
    end
  end
end

This implementation is pretty slow. It took almost five seconds to print out the first 35 numbers in the sequence on my machine. You can improve the performance dramatically with just a few more lines:

class Fibber
  def initialize
    @cache = {}
  end

  def fib(n)
    if n < 2
      n
    else
      @cache[n] ||= fib(n - 1) + fib(n - 2)
    end
  end
end

This takes around a millisecond on my machine.

Your application will perform better if you remove functionality you don’t need, but removing lines and watching the tests still pass doesn’t mean you haven’t lost anything.

Edge cases

In version 2.4.0 of factorybot, we added almost three hundred lines of code and almost no features. That seems like a lot of code for no new functionality, but what actually happened is that we fixed a number of annoying bugs. Fixing the last few edge cases for a library can be a huge undertaking; the last ten percent frequently takes as long as the first ninety percent. There have been times that we’ve been tempted to rewrite factorybot, because it doesn’t seem that hard to implement at a glance. However, a new implementation would reintroduce bugs that have been fixed for years.

If you’re thinking of rewriting something, remember that it’s probably not as easy as it looks. You don’t just have to reimplement the features; you have to fix the edge cases, too.

Love your lines

Let’s stop thinking in terms of lines of code. Here are a few tips:

  • Think in terms of complexity, not lines
  • Use flog or rcov, not wc -l
  • If you want to remove complexity or improve performance, remove features and not just code

As a community, we’ve developed a number of techniques and tools for evaluating code. We can do better than just counting lines.