What To Do When You Don't Know What To Do

Sometimes, test driven development is really annoying, and the abundance of untested applications out there suggests that I am not the first person to have had this thought.

The thing is, there are limitations to test driven development. The workflow of test driven development essentially goes like this:

  1. Create a user story that encapsulates a chunk of functionality of the application
  2. Write tests for this story (which should fail before you have implemented anything)
  3. Write the code to make the tests pass

What is missing from this flow, however, is a period of time to consider the overall design of the code in the application. If you’re not careful, your program can end up like an old school map of Africa - the small details are right, but the overall structure is off.


Test Driven Development rests on the assumption that you basically know the optimal way to make your tests pass in advance. If you’ve written several similar apps, you probably do know how to fill in your tests. However, if you’re new to programming, or if you’re creating a unique program, you may not.

Recently, my coworker Jason and I encountered this problem when working on the open source app GNITE. GNITE requires a large internal tree structure that can support up to 5 million nodes. Neither of us had worked with trees that large before, and we didn’t know what the stumbling blocks were going to be. So, we decided to spike it out.

A spike is a quickly written, untested piece of code that’s intended to be thrown away. You intentionally abandon your best practices in the interest of quickly learning about good ways to design your app. Once you’ve gained everything you can from it, however, you scrap it and don’t build anything else off it.

In our case, we started out fooling about with jsTree, to display the tree, and the Ruby gem ancestry, to organize the tree structure in the database. (If you want to check it out, it’s up on GitHub here and you can see it working here.) As it turned out, ancestry and jsTree worked well together: ancestry stores the entire lineage of each node on the node itself, which made things like asynchronous search work well with jsTree.
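To make "stores the entire lineage of each node on the node itself" concrete, here is a minimal sketch in plain Ruby. It is not the ancestry gem's API or our spike code: the Node struct, the path field, and the descendants and search helpers are all hypothetical names, and an array stands in for the database table. The point is only that a search hit already carries the list of branches the tree widget needs to open.

```ruby
# Hypothetical in-memory stand-in for a database row; not the ancestry gem's API.
# `path` holds the ids of all ancestors, root first, joined by "/".
Node = Struct.new(:id, :name, :path) do
  # Ids of every ancestor, read straight off the node with no extra lookups.
  def ancestor_ids
    path.split("/").map(&:to_i)
  end

  # The path that this node's direct children carry.
  def child_path
    path.empty? ? id.to_s : "#{path}/#{id}"
  end
end

# Made-up sample data for illustration.
NODES = [
  Node.new(1, "Animalia", ""),
  Node.new(2, "Chordata", "1"),
  Node.new(3, "Mammalia", "1/2"),
  Node.new(4, "Aves", "1/2"),
  Node.new(5, "Felidae", "1/2/3")
].freeze

# All descendants of a node: every row whose path equals or extends child_path.
# In a database this would be a single prefix match on the path column.
def descendants(nodes, node)
  prefix = node.child_path
  nodes.select { |n| n.path == prefix || n.path.start_with?("#{prefix}/") }
end

# Search by name and return each hit together with its ancestor ids, so the
# front end knows which branches to expand without walking parent pointers.
def search(nodes, term)
  nodes.select { |n| n.name.downcase.include?(term.downcase) }
       .map { |n| [n.name, n.ancestor_ids] }
end
```

Calling search(NODES, "fel") returns the Felidae node along with the ids of Animalia, Chordata and Mammalia, which is the information an asynchronous search needs in order to reveal the match in a lazily loaded tree.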

Happy with our initial findings, and looking to push our luck, we extended our program to parse large, compressed tree files. Once we had the large tree in the database, the display behavior remained satisfactory, but the act of parsing a tree took a long time, too long to be a regular occurrence on the website. Initially, we had considered making parsing trees a regular part of the workflow, but the 5 to 10 minute load times suggested that we would need to look into some sort of caching (or, possibly, a different way of storing nodes on the back end with quicker insertion).

In the end, we decided caching uncompressed trees in the database could be a potential solution, given the amount of space we had. In our case, the most straightforward approach worked out well for displaying trees, but not for saving and loading them.
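The caching idea can be sketched roughly like this. Everything here is an assumption made for illustration rather than what GNITE does: the TreeCache class, the tab-separated "id, parent id, name" line format, and the in-memory Hash that stands in for database tables. The sketch keys each parsed tree by a digest of the compressed upload, so the slow parse runs only once per distinct file.

```ruby
require "digest"
require "zlib"

# Hypothetical cache mapping a digest of the compressed file to its parsed tree.
# A Hash stands in for the database so the sketch stays runnable on its own.
class TreeCache
  attr_reader :parse_count

  def initialize
    @store = {}
    @parse_count = 0
  end

  # Returns the parsed tree, inflating and parsing only on a cache miss.
  def fetch(compressed_bytes)
    key = Digest::SHA256.hexdigest(compressed_bytes)
    @store[key] ||= parse(Zlib::Inflate.inflate(compressed_bytes))
  end

  private

  # Stand-in for the slow step. Assumed format (not GNITE's real one):
  # one node per line as "id<TAB>parent_id<TAB>name", with an empty
  # parent_id marking the root.
  def parse(text)
    @parse_count += 1
    text.each_line.map do |line|
      id, parent_id, name = line.chomp.split("\t")
      { id: id.to_i, parent_id: parent_id.empty? ? nil : parent_id.to_i, name: name }
    end
  end
end
```

A second call to fetch with the same bytes returns the stored result without parsing again, which is the behavior we would want from a database-backed version. The trade-off is the one mentioned above: it spends storage space to avoid repeating the expensive load.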

Now that we have a loose idea of the structure we’re going to pursue, we can keep it in mind when we’re writing our code. And, we have some really relevant example code to work off, which will (hopefully) make the “getting the tests to pass” step in —