Return an Enumerator When Your Collection Has Multiple Traversals

Joël Quenneville

A Ruby collection can only define a single #each to use with the Enumerable module. This is fine when the collection has one sensible way to be traversed, but what about collections with multiple valid ways of being traversed?

Binary trees have multiple traversals

3 diagrams showing traversing a binary tree in pre-order, in-order, and post-order

Unlike an array where usually there’s only a single way you’d want to traverse it, binary trees have many common ways of being traversed. Let’s say you’re building a binary tree object and you want to support the 3 classic depth-first traversals:

  1. Pre-order - evaluate a node itself first, then its left subtree, then its right subtree
  2. In-order evaluate the left subtree, then the node itself, then the right subtree
  3. Post-order evaluate the left subtree, then the right subtree, then finally the node itself.

We can implement the 3 traversals as methods on our object. These will yield one item at a time in the order given by our traversal, allowing us to write code that iterates through a tree in a particular order like like: my_tree.preorder { |item| ... }

class BinaryTree
  def preorder(&block)
    block.call(val)
    left.preorder(&block)
    right.preorder(&block)
  end

  def inorder(&block)
    ...
  end

  def postorder(&block)
    ...
  end
end

Limitations of Enumerable

But what if we want some of the goodies supplied by the Enumerable module?

Enumerable wants a method named #each in order to be included into your collection class. We can alias one of our traverals to #each, but the problem is that this forces us to pick one. If we alias #eachto#preorder, now all enumerable methods on our collection only work in pre-order. If we want to #find in post-order, we’re out of luck.

class BinaryTree
  include Enumerable

  # all the enumerable methods will happen in preorder
  alias_method :each, :preorder

  # ...
end

Returning an Enumerator

Instead of reaching for Enumerable (the module), we can take a different approach using Enumerator (the class). One way to think about enumerators is that they encode different traversals of a collection.

Leaning on this mental model, we can return an enumerator from each of our traversal methods. Enumerator objects have access to all the enumerable methods we know and love. This allows the caller of your tree object to pick the traversal they want like:

  • tree.inorder.find { ... }
  • tree.postorder.drop_while { ... }
  • tree.preorder.map { ... }

Making this change turns out to be a one-liner thanks to Ruby’s Object#to_enum method that will construct an enumerator out of the given traversal method.

def preorder(&block)
  return to_enum(:preorder) unless block_given?

  block.call(val)
  left.preorder(&block)
  right.preorder(&block)
end

Curious how this how this BinaryTree class looks altogether? Check out the full implementation as a gist.

Recommendation

This all leads to the following heuristic I use when creating iterable collections in Ruby:

If a collection has more than one valid traversal, return Enumerator instead of including Enumerable

This heuristic was partly inspired by this older article on when returning enumerators is preferable.

Teaching Ruby to Count

Want to dig more into these ideas? Check out my talk Teaching Ruby to Count. The talk explores how to make the most of out ranges, enumerators, and enumerables as well as how to make your custom objects play just as nicely with them as built-in Ruby primitives.