The part where our hero thinks everything is fine
Recently, I worked on a project where we needed to build a Ruby service that would run queries on a third-party database and send the results to our main Rails application. The third-party database was a huge MSSQL database with tons of data, but we were only using a handful of tables. As I started building the service, I wrote specs for the Ruby code, but mocked all the database calls to return predefined data. This gave me confidence that my Ruby code worked, but the core purpose of the service was querying the database, and that was entirely untested.
The part where our hero realizes everything is not, in fact, fine
I threw a ‘write SQL tests’ card in the tech debt backlog and carried on for a while, tweaking the queries and ignoring the little voice in the back of my head that screamed in agony every time I committed a change to a SQL query string with no corresponding spec. Eventually, it became clear that the queries I had written were more likely to change over time than I initially thought, and that everyone else on the project avoided them. It was time to put on my grown-up hat and write some specs.
The part where our hero finds herself in unfamiliar territory
This service was missing two things that I am usually able to rely on when writing specs: Postgres and Rails (specifically, Active Record). It’s fairly trivial to get Postgres running on my local development environment (I use a Mac), and on the images I usually use for CI (in this case, ubuntu-latest on GitHub Actions).
One of my first thoughts was to use Postgres for specs and just sort of cross my fingers that the syntax I used for queries would work on both database platforms. In the Ruby service, I used TinyTDS to connect to the external database, which worked really well for actually executing the queries. I was hoping I could just swap out MSSQL for Postgres in the tests and keep using TinyTDS, but unfortunately, TinyTDS is for Microsoft SQL Server and Sybase databases, and there really is no way to make it work with Postgres.
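For reference, connecting with TinyTDS looks roughly like this. This is a sketch: the environment variable names, table, and column are hypothetical stand-ins for the service’s real configuration.

```ruby
require "tiny_tds"

# Hypothetical configuration; the real service reads its own ENV names.
client = TinyTds::Client.new(
  host:     ENV.fetch("MSSQL_HOST"),
  port:     Integer(ENV.fetch("MSSQL_PORT", "1433")),
  username: ENV.fetch("MSSQL_USERNAME"),
  password: ENV.fetch("MSSQL_PASSWORD"),
  database: ENV.fetch("MSSQL_DATABASE")
)

# execute returns a TinyTds::Result; iterating yields each row as a hash.
client.execute("SELECT TOP 1 Email FROM [User]").each do |row|
  puts row["Email"]
end
```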
The part where our hero contemplates fleeing back to familiar grounds
Next, I considered whether refactoring the service to use some Rails components would make testing easier. I thought I might be able to take advantage of Active Record not only to make queries simpler, but also to make it possible to connect to an MSSQL database in production and test against a Postgres database, leaving the hard bits up to the respective adapters. The more I thought about it, the more I realized that although it would be a potentially useful refactor in many ways, attempting a refactor of this complexity on code that is completely untested was way too risky. Unless I found a time machine that would let me go back and start with Active Record from the beginning, I was going to need to find a way to write specs without it.
The part where our hero faces her demons and begins to see light
While developing the service initially, I used the Docker image Microsoft provides to run SQL Server locally. This seemed like a good place to start for running specs, too. It turned out to be straightforward to remove the mocks from the test code and have specs use TinyTDS to connect to the test database. I used Climate Control to set the environment variables used by TinyTDS in tests to the values I had used during development.
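A minimal sketch of that setup, assuming a spec_helper wired up with RSpec; the variable names and values here are hypothetical stand-ins:

```ruby
require "climate_control"

# Point TinyTDS at the local SQL Server container for every example.
RSpec.configure do |config|
  config.around(:each) do |example|
    ClimateControl.modify(
      MSSQL_HOST:     "localhost",
      MSSQL_USERNAME: "sa",
      MSSQL_DATABASE: "service_test"
    ) { example.run }
  end
end
```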
Now that I had specs connecting to a database, I needed to set up the database with the schemas, tables, and functions the service depended on. I also needed a way to create objects in the database and reset it between runs. In a traditional Rails application, I usually use factory_bot to do this. I took another moment to appreciate Rails and all the things it brings to my life. I double-checked to see if time travel had become possible, so I could restart the project with what I had learned. It looked like I was just going to need to build some tools myself.
The part where our hero gains great appreciation for the tools she can no longer use and makes new ones
First, I created a schema to replicate the portions of the database I needed for tests. I began by combing through the queries in the service and making note of every table name and column I used. I quickly realized that my existing specs could help with this! When I ran a spec that used a table or column that was missing, I would get an error message with the missing database object. I kept adding tables to my local test database until none of the spec failure messages were about a missing database object. I now had the minimum schema necessary to test the service.
Our service relied on a custom function in the database. I needed to be able to call this function for the tests to work properly, but it wasn’t necessary to completely replicate the function and its logic. Instead, I created a table into which I could insert the inputs and outputs of the function. Then I created a function that would look up a row given the input and return the output. This allowed me to mock the database function call without changing the database query.
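Here is a sketch of that stub with illustrative names (`AddressMatchStub` and its columns are hypothetical, and `client` stands in for the service’s TinyTds::Client):

```ruby
# Hypothetical stub: a lookup table plus a function that returns canned
# results, standing in for the real address-matching logic.
client.execute(<<~SQL).do
  CREATE TABLE AddressMatchStub (
    Input  NVARCHAR(255) NOT NULL,
    Output NVARCHAR(255)
  )
SQL

client.execute(<<~SQL).do
  CREATE FUNCTION dbo.AddressMatch (@input NVARCHAR(255))
  RETURNS NVARCHAR(255)
  AS
  BEGIN
    RETURN (SELECT TOP 1 Output FROM AddressMatchStub WHERE Input = @input)
  END
SQL

# A spec can then seed the stub with whatever the query under test expects:
client.execute(
  "INSERT INTO AddressMatchStub (Input, Output) VALUES ('123 Main St', 'MATCH-1')"
).do
```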
As I built the database through this process, I kept track of the SQL commands I used to create tables. Together, these commands became a schema that could be loaded into the database when setting it up for tests. I created a DataHelpers module for specs that uses these commands to set up a test database or clean it between examples.
RSpec.configure do |config|
  config.before(:suite) do
    DataHelpers.setup_database
  end

  config.before(:each) do
    DataHelpers.clean_database
  end
end
module DataHelpers
  def self.setup_database
    drop_tables
    drop_address_match_function
    create_schemas
    load_schema
    create_address_match_function
  end

  ...
end
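The cleaning half can be as simple as issuing one DELETE per table between examples. A minimal sketch, with a hypothetical table list:

```ruby
module DataHelpers
  # Hypothetical table list; the real suite tracks every table its schema
  # creates, ordered so child tables clear before their parents.
  TABLES = %w[User AddressMatchStub].freeze

  # Build one DELETE per table; running these between examples empties the
  # data without dropping the schema.
  def self.cleanup_statements
    TABLES.map { |table| "DELETE FROM [#{table}]" }
  end

  def self.clean_database
    cleanup_statements.each { |sql| client.execute(sql).do }
  end
end
```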
Now I had a test suite that connected to a local MSSQL database and could set up and clean the database when running tests. All that was left was creating data during tests. Since I didn’t have objects defined in the service to encapsulate the data yet (have I mentioned how much I wish I had used Active Record?), I couldn’t rely on factory_bot to create data. Instead, I created methods that run insert queries with reasonable defaults and called them the way I would call factory_bot factories. Unlike factory_bot, these methods do not return the object that was created, but they worked well enough without that.
def self.create_user(email, full_name, admin = false)
  # User is a reserved word in T-SQL, so bracket it; escape and quote the
  # string values, and send the boolean as a bit.
  client.execute(
    "INSERT INTO [User] (Email, FullName, Admin)
     VALUES ('#{client.escape(email)}', '#{client.escape(full_name)}', #{admin ? 1 : 0})"
  ).do
end
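Hand-rolled inserts like this need careful quoting: T-SQL string literals take single quotes with embedded quotes doubled, and booleans become bits. A small hypothetical helper can keep that logic in one place:

```ruby
# Hypothetical helper: render a Ruby value as a T-SQL literal.
def sql_literal(value)
  case value
  when String  then "'#{value.gsub("'", "''")}'"
  when true    then "1"
  when false   then "0"
  when nil     then "NULL"
  when Numeric then value.to_s
  else raise ArgumentError, "don't know how to quote #{value.inspect}"
  end
end

sql_literal("O'Brien")  # => "'O''Brien'"
sql_literal(true)       # => "1"
```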
At this point, I had a test suite that connected to a test MSSQL database, actually tested the database queries, and set up/cleaned up the database so it worked for multiple test runs.
The part where our hero reflects on the lesson she learned the hard way
When I began writing this Ruby service, I assumed I would simply be connecting to an external database, running a few queries, and forwarding on the results to our main Rails application. Surely something this straightforward doesn’t need all of Rails! I can just make some little helper classes and move on. It turns out, the service itself really does “simply” connect to the database, make queries, and send data to the app, but simple rarely means easy.
When I chose my tools, I was only considering writing and running the code; I didn’t take any time to think through the best way to test it. If I had begun my design and planning with a testing strategy, I might have discovered the pitfalls I encountered before I started wishing for a time machine. Test-driven development doesn’t start when we write the first line of code. Next time, our hero will consider testing as part of every decision.