We recently rewrote the code that powers this blog. Previously, the blog ran as a Middleman app. The new system is tailored to our preferred authoring workflow (Markdown + GitHub) and uses GitHub webhooks to automate everything that isn't writing or reviewing a post.
Splitting content from engine
The idea to rebuild the blog stemmed from a conversation about publishing new blog posts. We love our process of writing posts in Markdown, versioning them via Git and reviewing them via GitHub pull requests. However, in our previous setup, we needed to redeploy the blog to Heroku whenever a new post was published. This was tedious and frustrating.
The ideal workflow would be to merge a pull request for a new blog post and have everything else happen automatically. A big obstacle to this goal was the coupling between the content of our blog and the application code that served it.
This led to the decision to break up our blog into two independent repositories. One would contain the blog engine, written in Rails, while the other would be strictly Markdown documents.
Setting up a GitHub webhook
GitHub allows you to subscribe to events on a repository via a webhook: you provide a URL, and GitHub sends a POST request to it every time the designated event occurs.
When a new post gets merged into the master branch of the content repository, we respond by kicking off an import.
GitHub’s documentation for webhooks is pretty good. Check it out.
For security reasons, we want to restrict access to the webhook URL to payloads that actually come from GitHub. GitHub allows you to set a secret key, which it uses to sign each request. If the request signature matches an HMAC of the payload computed with the same secret key, we know the request is genuine.
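Verifying the signature only takes a few lines of Ruby. GitHub computes an HMAC-SHA1 of the raw request body with your secret and sends it in the `X-Hub-Signature` header. Here's a minimal sketch of the check in a Rails controller; the controller, job, and environment variable names are our illustrations, not the engine's actual code:

```ruby
class WebhooksController < ApplicationController
  # GitHub can't send a Rails CSRF token, so we rely on the signature instead
  skip_before_action :verify_authenticity_token

  def create
    if valid_signature?
      # Kick off the import of the merged post (hypothetical job name)
      ImportArticlesJob.perform_later
      head :ok
    else
      head :unauthorized
    end
  end

  private

  def valid_signature?
    secret = ENV.fetch("GITHUB_WEBHOOK_SECRET")
    expected = "sha1=" +
      OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("sha1"), secret, request.raw_post)

    # Constant-time comparison guards against timing attacks
    Rack::Utils.secure_compare(expected, request.headers["X-Hub-Signature"].to_s)
  end
end
```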
Caching with Fastly
We host our blog on Heroku and use Fastly as our CDN. Jessie wrote a fantastic post on how to set up Fastly with a Rails application, and we used that approach for the blog engine. When we import a new post, we purge the cache. However, this doesn't help for posts that shouldn't show up immediately, such as those with future publication dates, so we also run a daily task via Heroku Scheduler that purges the article pages.
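The daily task itself is small: a rake task that Heroku Scheduler invokes once a day. A sketch of what ours might look like (the task name is an assumption), using the purge helpers from the `fastly-rails` gem described below:

```ruby
namespace :fastly do
  desc "Expire cached article pages so future-dated posts appear on time"
  task purge_articles: :environment do
    # Purges every page tagged with the general article surrogate key
    Article.purge_all
  end
end
```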
Initially we were confused by the `Article#purge` and `Article.purge_all` methods included in our models by the `fastly-rails` gem. `Article#purge` will expire all pages that have the surrogate key for that individual article, while `Article.purge_all` will expire all pages that have the general article surrogate key.
Some pages have both, for example:

```ruby
def index
  @articles = Article.recent
  set_surrogate_key_header Article.table_key, @articles.map(&:record_key)
end
```
This index page can be expired by calling `Article.purge_all` or by calling `purge` on any of the article objects rendered on that page.
So when should you use one over the other?
- When creating a new object, use `purge_all`. A new object isn't on any page yet, so `purge` wouldn't do anything.
- When updating an object, use `purge`. This will expire any pages that render that object.
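Wired into the model lifecycle, that looks roughly like the sketch below. Using callbacks here is our illustration; the engine could just as easily call the purge methods from a controller or an import job.

```ruby
class Article < ActiveRecord::Base
  # New record: only listing pages (index, feeds) are stale
  after_create { self.class.purge_all }

  # Existing record: expire every page that renders this article
  after_update :purge
end
```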
Building a sitemap
Search engines like Google and Bing use XML sitemaps to discover and index content. The Giant Robots sitemap allows us to inform search engines about the relative importance of each URL on the site and how often each changes. The most popular gem we found for generating sitemaps, SitemapGenerator, generates static files and suggests setting up a cron job to update them periodically. We found that it wasn't difficult to serve our own dynamic sitemap using the `Builder` templates that ship with Rails.
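A Builder template for a sitemap can be as small as the sketch below; the route, instance variable, and attribute names are assumptions for illustration:

```ruby
# Rendered as XML by Builder, e.g. from a SitemapsController
xml.instruct!
xml.urlset xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9" do
  @articles.each do |article|
    xml.url do
      xml.loc article_url(article)
      xml.lastmod article.updated_at.iso8601
      xml.changefreq "weekly"
      xml.priority 0.5
    end
  end
end
```

Because the template is rendered on request, the sitemap is always current, with no cron job required.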
Authoring posts locally
While splitting the content from the engine simplified a lot of things, it did make previewing posts more difficult. Previously, an author could spin up a local Middleman server and preview their post exactly as it would show up on the blog. However, the new engine doesn’t read files from the local repo but imports them from GitHub instead. This would force authors to:
- Set up a local version of the engine
- Connect it to the GitHub repository’s webhook
- Push to GitHub in order for GitHub to send the file back down to their local machine so the engine can render it.
This whole workflow is tedious. We considered using a standard Markdown editor such as Marked to preview the posts but then they wouldn’t be rendered using our stylesheet and layouts.
We decided to implement an author mode that would read Markdown files from the local file system rather than from the database and GitHub. To do this, we built a set of lightweight objects that mimic our ActiveRecord models: `Local::Article`, `Local::Author`, and `Local::Tag`. These objects are backed by the file system rather than the database.
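A stripped-down sketch of what one of those file-backed objects might look like; the attribute and method names here are illustrative, and the real objects mimic more of the ActiveRecord interface:

```ruby
module Local
  class Article
    attr_reader :slug, :body

    # Read every Markdown file in the local posts directory
    def self.all
      Dir.glob(File.join(ENV.fetch("LOCAL_POSTS_PATH"), "*.md")).map do |path|
        new(path)
      end
    end

    def self.find(slug)
      all.detect { |article| article.slug == slug }
    end

    def initialize(path)
      @slug = File.basename(path, ".md")
      @body = File.read(path)
    end
  end
end
```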
To ensure the correct objects are called by the controller, we added the following initializer. Note the default value in `ENV.fetch`: without it, a bare `ENV.fetch("AUTHOR_MODE")` would raise a `KeyError` in any environment where the variable isn't set.

```ruby
if ENV.fetch("AUTHOR_MODE", "false") == "true" && ENV["LOCAL_POSTS_PATH"].present?
  require "local/article"
  require "local/tag"
  require "local/author"

  Article = Local::Article
  Author = Local::Author
  Tag = Local::Tag
end
```
You might expect this to cause some "already initialized constant" warnings if these assignments redefine constants from our models, or if the models get loaded after this initializer. However, this is not the case: Rails' autoloading system only loads a model file when it encounters an undefined constant named for that file. Since our constants are already defined, Rails never loads the models.
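With both variables set, an author can preview posts locally with something like `AUTHOR_MODE=true LOCAL_POSTS_PATH=../blog-posts rails server` (the path here is illustrative).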
Conclusion
Since these changes went live, authoring blog posts is much more streamlined. We get to focus on writing content in our favorite plain-text editor and getting feedback on GitHub. Once we're satisfied, we merge the post, and it automatically shows up on the blog on its publication date. Magical!