This article assumes you’re familiar with Git, code reviews, and continuous integration. If you’re not, you might want to familiarise yourself with those concepts first, as they are great tools for writing and maintaining good software.
The Problem
Since you’ve made it this far, I bet this situation will sound familiar to you:
You’ve been working on your feature branch for a few hours, and you’re almost
ready. The UX is slick, and the tests are green. It’s time to
create a pull request! But before doing so, you make sure your branch contains
the latest changes merged to master. And as the diligent developer you are,
after rebasing your branch on top of the latest changes, you rerun your test
suite and check your app to make sure everything still works. As you try to
start your app, you notice you need to run the latest migrations or maybe
update your dependencies. No biggie. You run rails db:migrate or yarn install,
and all is well. Except for git status, which now shows changes to db/schema.rb
or yarn.lock. That’s never a great feeling.
What if you could avoid having to feel that again by just adding a few lines of code to your CI pipeline? Well, you totally can.
Why the Problem Happens
Two things can cause the problem I mentioned above:
- An autogenerated file was committed, but the code that generates it was not. For example, someone edited package.json and ran yarn install locally, then committed the changes to yarn.lock but forgot to include the changes to package.json.
- The code that generates the file was committed, but the autogenerated file was not. For example, someone committed a new migration but forgot to add the updated database schema.
We’ve all made mistakes like these in the past, or missed them during a code review. And that’s OK. In my opinion, if a computer generates a file, then a computer should be in charge of making sure that file is up to date. For more on the topic, see Further Considerations below.
The Solution
Add a step to your CI pipeline that will fail the build if any autogenerated files change.
There are multiple ways of doing so, but the key step is having your CI pipeline generate all autogenerated files. You can then fail the pipeline by detecting changes (maybe using Git) or by removing write permissions on the autogenerated files before running the commands that update those files.
Removing Write Permission
Let’s say you have a file that keeps track of your database schema called db/schema.rb. We can start by locking that file against writes:
chmod 0444 db/schema.rb
Next, we try to generate a fresh version of that file. If the schema changes, the script will try to write to the schema file, and the resulting permission error will fail our build.
bundle exec rails db:create db:migrate
This example uses a standard Ruby on Rails app to showcase the technique, but it applies to any language or framework.
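The two steps above can be wrapped into one small helper. This is a minimal sketch in POSIX shell, not a definitive implementation: run_with_locked_file is a hypothetical name, and the helper assumes the build does not run as root, since root ignores file permissions and the lock would silently do nothing.

```shell
#!/bin/sh
# Sketch: run a generator command while one file is read-only.
# `run_with_locked_file` is a hypothetical helper, not part of Rails or Git.
run_with_locked_file() {
  locked_file=$1
  shift

  # Remove write permission for everyone.
  chmod a-w "$locked_file" || return 1

  # root can write to read-only files, so the lock would be ineffective.
  if [ -w "$locked_file" ]; then
    echo "warning: $locked_file is still writable (running as root?)" >&2
    chmod u+w "$locked_file"
    return 2
  fi

  # Run the generator; it fails if it tries to write to the locked file.
  "$@"
  status=$?

  # Restore owner write permission so later build steps can use the file.
  chmod u+w "$locked_file"
  return "$status"
}
```

A CI step could then call run_with_locked_file db/schema.rb bundle exec rails db:create db:migrate. The check only catches stale files if the generator leaves the file untouched when nothing changed, which is worth verifying for your tool.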
Checking for Changes
With this technique, we first generate a fresh version of our file:
yarn install
And then we check for changes and exit with an error code if we find any:
# Fail the build if any tracked file was modified.
if ! git diff --quiet; then
  echo "Oops, something changed!"
  exit 1
fi
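One limitation of the check above is that git diff only reports changes to tracked files, so a freshly generated file that was never committed goes unnoticed. The sketch below uses git status --porcelain instead, which also lists untracked files. The function name fail_if_worktree_dirty is hypothetical, and the sketch assumes it runs inside the repository checkout.

```shell
#!/bin/sh
# Sketch: fail when the working tree differs from the last commit.
# `fail_if_worktree_dirty` is a hypothetical helper name.
fail_if_worktree_dirty() {
  # --porcelain prints one line per modified, staged, or untracked file,
  # and nothing at all when the working tree is clean.
  changes=$(git status --porcelain) || return 2

  if [ -n "$changes" ]; then
    echo "Oops, something changed:" >&2
    echo "$changes" >&2
    return 1
  fi
}
```

In a pipeline this would run right after the generator, for example yarn install followed by fail_if_worktree_dirty || exit 1. Printing the list of changed files makes the failed build easier to diagnose.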
Fancy CLIs
Some tools have this functionality built in. Bundler has a deployment mode, while Yarn has a frozen mode. In both cases the tool refuses to update the lockfile and fails the install instead.
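As a rough sketch of how these modes could be used in a CI step: the helper names yarn_frozen_flag and check_lockfiles below are hypothetical. The sketch assumes Bundler 2.1 or later (for the bundle config set syntax) and uses Bundler's frozen setting, a lighter relative of deployment mode. Yarn 1.x names its flag --frozen-lockfile, while Yarn 2 and later name it --immutable.

```shell
#!/bin/sh
# Sketch: pick the Yarn flag that refuses to modify yarn.lock.
# `yarn_frozen_flag` is a hypothetical helper name.
yarn_frozen_flag() {
  # Keep only the major version, e.g. "1.22.19" becomes "1".
  major=${1%%.*}
  if [ "$major" -ge 2 ]; then
    echo "--immutable"
  else
    echo "--frozen-lockfile"
  fi
}

# Hypothetical CI step built on the helper above.
check_lockfiles() {
  # With the frozen setting, `bundle install` fails instead of updating
  # Gemfile.lock when it is out of sync with the Gemfile.
  bundle config set --local frozen true || return 1
  bundle install || return 1

  # Fails instead of rewriting yarn.lock when it needs an update.
  yarn install "$(yarn_frozen_flag "$(yarn --version)")"
}
```

Calling check_lockfiles from the pipeline covers both lockfiles without a separate change-detection step, because the tools themselves exit with an error.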
Further Considerations
Benefits
- Consistency across environments: This means the code running on your machine is more likely to resemble the code running on your colleague’s machine, and the code running in production. That’s nice! Have you ever tried to find a bug that only exists in staging, and no one knows why? That’s not nice at all.
- More focused code reviews: I believe code reviews are great for sharing knowledge and discussing ideas and concepts. I think they suck at checking that the right bytes ended up in the right place. Get yourself a linter, and have CI regenerate your autogenerated files, of course!
Drawbacks
Generating all files with each CI build means your builds will take longer to
finish. Running rails db:migrate, for example, is much slower than recreating
your database from the schema dump.