Building Haskell Projects with Halcyon

With our first production Haskell application, Carnival, we found that slow compile times and deployment to Heroku were two pain points. Since that original blog post, a number of projects have made headway attacking these issues in various ways. Of these, the front-runners in my mind are Docker, Nix, and a Bash-based project named Halcyon.

In this post, I want to talk about how we updated Carnival to use Halcyon for local development, Continuous Integration (CI) testing, and deployment. We decided to try Halcyon because it is very actively maintained and implements a great Heroku deployment experience: the Haskell on Heroku buildpack.

Halcyon

Halcyon has extensive documentation and a great tutorial on its site. This post is going to be about our own experience and does not comprehensively describe the tool and all its features. If you want to dive deeper or have any questions, those resources are the place to start.

Halcyon’s general approach is to build everything you need: from GHC, to Cabal, to a project sandbox, to executable tools, to the project itself – all of it within an isolated directory (/app by default). If something changes or you build a different project, the directory is cleared and the process happens again from the start.

This may seem time-consuming or wasteful, but it’s not. Everything that is built is cached both locally and on Amazon S3. This means building a project may require little more than extracting a few archives and take only seconds. By isolating everything, always building “from scratch”, and caching as much as possible, we get builds that are robust and repeatable, but also fast.

The things you build and upload to S3 can also be reused across your team, drastically cutting down on the total time spent compiling. Unfortunately, there are two limiting factors:

Cached builds can only be reused on the same platform

These programs are compiled and the only way to ensure compatibility is to create per-platform binaries. At thoughtbot, we have a handful of platforms in play: Arch Linux, OS X, and two versions of Ubuntu (used on Heroku and Travis). Even though any required compilation may have to happen four times, this is still an improvement over everyone compiling all the time.

Sandboxes can only be reused at the same absolute path

This is an unfortunate limitation of Cabal: sandboxes are not relocatable. In other words, the sandbox definition contains absolute paths so it can only be used in a directory that matches the absolute path where it was built. If you’ve ever renamed a Haskell project using Cabal sandboxes, you probably noticed that it broke the sandbox. This is a known issue, but there hasn’t been much progress yet.

This is part of the reason for Halcyon using /app by default. The other part is that Heroku slugs are compiled in /app, so if we expect to reuse a cached sandbox and achieve a fast deployment on Heroku, those sandboxes must have also been built in /app.
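To see the relocation problem concretely, a small check can compare the path recorded in a sandbox config against what actually exists on disk. This is a minimal sketch and not part of Halcyon or Cabal: the function name sandbox_is_usable is hypothetical, and it assumes the config contains a package-db: line holding an absolute path, as Cabal-generated cabal.sandbox.config files typically do.

```shell
# Hypothetical helper (not part of Halcyon or Cabal).
# Succeeds only if the package database recorded in a sandbox config
# still exists at the absolute path written into that config.
sandbox_is_usable() {
  config=${1:-cabal.sandbox.config}
  [ -f "$config" ] || return 1

  # Assumption: the config has a line of the form "package-db: /abs/path".
  package_db=$(sed -n 's/^package-db:[[:space:]]*//p' "$config" | head -n 1)
  [ -n "$package_db" ] && [ -d "$package_db" ]
}
```

For example, sandbox_is_usable /app/sandbox/cabal.sandbox.config || echo "sandbox paths do not resolve here" would flag a sandbox that was moved or renamed after it was built.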

Installing Halcyon

Everything I’m about to describe can be accomplished automatically by running:

source <( curl -sL https://github.com/mietek/halcyon/raw/master/setup.sh )

Because I’m not a huge fan of recommending remote code execution, I’d prefer to outline the steps this script would go through and show that it’s not difficult to execute them manually. This increases understanding of the underlying system in case something goes wrong. Also, the installation script assumes you’re using Bash, whereas the individual steps I’ll show work in any POSIX shell.

First, there are some system packages required for when you start compiling things. If on Linux, your distro should have some kind of “Build Tools” package (Arch has base-devel, Ubuntu has build-essential). You’ll need that, along with git and zlib. If on OS X, you’ll want to brew install bash coreutils git. If you run into any issues, you should directly reference the source of the above setup script to find the packages appropriate for your platform.

Halcyon requires a user-writable /app. This path can be configured, but because of the relocation limitation I described earlier, I recommend using the default:

sudo mkdir -p /app
sudo chown $USER /app

Halcyon doesn’t care where you install it, but to keep everything together, I prefer to clone it into /app. This is also what setup.sh would do:

git clone https://github.com/mietek/halcyon /app/halcyon

Congrats, you’ve installed Halcyon:

/app/halcyon/halcyon --help

Halcyon build

With the above in place, we can use Halcyon to install all sorts of Haskell projects: GHC, Cabal, a project from Hackage, GitHub, or a local directory. Right now, we’re only interested in the last one.

From within a project directory, run:

/app/halcyon/halcyon build

In the case of Carnival, this installs the appropriate GHC, Cabal, alex, happy, yesod-bin, and project dependencies, then builds the project itself, all into /app. If everything has been built previously (by a teammate using the same platform or you five minutes ago), the whole process takes about 10 seconds. If there’s been a code change and the app itself needs to be recompiled, it may take 45 seconds.

Configuring the build

Depending on the project, if all you did were the above steps, it would probably work, but you wouldn’t be getting the most out of Halcyon. The two most important things not mentioned so far are version constraints (to ensure repeatability) and private S3 storage (to ensure fast incremental builds).

Configuring these things can be done in one of three ways:

Environment variables

These are useful for sensitive values (e.g. AWS keys) or values that will change from environment to environment.

Magic files

These are created under a .halcyon directory in the project. They’re useful for configuration that will always be the same, like version constraints or sandbox extra apps. This is any configuration you’d want to commit alongside the source.

Command-line flags

Flags given at build time are useful for one-time changes, like explicitly triggering a rebuild.

At a minimum, you should set the environment variables for private S3 storage and create a .halcyon/constraints file. You can create the constraints file manually or by running /app/halcyon/halcyon constraints. Also, any constraint-less build will still generate a constraints file (using the latest versions of all dependencies), which can be found in your S3 bucket. This can be especially useful for creating constraints files for any sandbox extra apps.
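Before the first build, it can help to confirm that this minimum configuration is in place. The following is a minimal sketch, not a Halcyon command: check_halcyon_config is a hypothetical helper, and the variable names are the ones that appear in the Travis configuration later in this post.

```shell
# Hypothetical pre-build check (not a Halcyon command).
# Prints each missing piece of configuration and fails if any is missing.
check_halcyon_config() {
  missing=0

  # Variable names taken from the Travis configuration in this post.
  for name in HALCYON_AWS_ACCESS_KEY_ID HALCYON_AWS_SECRET_ACCESS_KEY HALCYON_S3_BUCKET; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name"
      missing=1
    fi
  done

  # The constraints file is expected relative to the project directory.
  if [ ! -f .halcyon/constraints ]; then
    echo "missing file: .halcyon/constraints"
    missing=1
  fi

  return "$missing"
}
```

Run from the project directory, check_halcyon_config && /app/halcyon/halcyon build would only start a build once the S3 settings and constraints file are present.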

Developing with Halcyon

In order to use the tools Halcyon has installed, we need to set a few environment variables (e.g. $PATH). Halcyon has a command that prints shell code to set these variables:

source <( /app/halcyon/halcyon paths )

If your shell doesn’t support source or process substitution, the following is a POSIX equivalent:

eval "$(/app/halcyon/halcyon paths)"

I prefer to do this each time I begin work on a Halcyon-based project. If you’d rather have these variables set all the time, you can put this line in your shell startup file (e.g. .bash_profile or .zshenv).
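If you do put this in a startup file, a guard keeps your shell working on machines where Halcyon is not installed. This is a minimal sketch: load_halcyon_paths is a hypothetical name, and the only Halcyon command it relies on is halcyon paths, as shown above.

```shell
# Hypothetical startup-file helper: load Halcyon's environment only if
# the halcyon executable is present, so other machines are unaffected.
load_halcyon_paths() {
  halcyon_bin=${1:-/app/halcyon/halcyon}
  if [ -x "$halcyon_bin" ]; then
    eval "$("$halcyon_bin" paths)"
  fi
}
```

Calling load_halcyon_paths with no arguments uses the default install location from earlier in this post; passing a path is only there to make the helper easy to try out.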

In addition to the environment variables, we also need to ensure that anything requiring a sandbox context gets the appropriate one. Halcyon built the sandbox in /app/sandbox, but we prefer to work directly in the project directory. The solution is to symlink the sandbox config into the current directory:

ln -sf /app/sandbox/cabal.sandbox.config cabal.sandbox.config
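Because ln -sf will happily create a dangling link, it can be worth checking that the sandbox config exists first. A minimal sketch, with link_sandbox_config as a hypothetical helper name and /app/sandbox as the default location described above:

```shell
# Hypothetical helper: link Halcyon's sandbox config into the current
# directory, refusing to create a dangling symlink.
link_sandbox_config() {
  sandbox_dir=${1:-/app/sandbox}
  config="$sandbox_dir/cabal.sandbox.config"
  if [ ! -f "$config" ]; then
    echo "no sandbox config at $config; run halcyon build first" >&2
    return 1
  fi
  ln -sf "$config" cabal.sandbox.config
}
```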

All of our normal commands should now work:

yesod devel
yesod test
cabal exec -- ghci Model.hs

Continuous Integration

Every CI service is different, but all should provide a way to:

Export the required environment variables
Perform a “before” step to create /app, install Halcyon, and set additional environment variables based on halcyon paths

Here’s an example .travis.yml based on the one we use for Carnival:

language: sh # Halcyon will handle all Haskell dependencies

env:
  global:
    # http://docs.travis-ci.com/user/environment-variables/#Secure-Variables
    - secure: ... # HALCYON_AWS_ACCESS_KEY_ID
    - secure: ... # HALCYON_AWS_SECRET_ACCESS_KEY
    - secure: ... # HALCYON_S3_BUCKET

before_install:
  - sudo mkdir -p /app
  - sudo chown $USER /app
  - git clone https://github.com/mietek/halcyon.git /app/halcyon

install:
  - /app/halcyon/halcyon build
  - /app/halcyon/halcyon paths > halcyon-env
  - ln -sf /app/sandbox/cabal.sandbox.config cabal.sandbox.config

script:
  - source halcyon-env && cabal configure --enable-tests && cabal test

Deployment

Haskell on Heroku is a buildpack that uses Halcyon internally. If you’ve already built your project for the same platform as your Heroku instance (probably Ubuntu 14.04), you should be able to deploy in seconds:

heroku config:set BUILDPACK_URL=https://github.com/mietek/haskell-on-heroku
git push heroku master

The buildpack handles invoking Halcyon to build and install your project under /app, makes the first executable listed in your cabal file available, and auto-generates a Procfile to invoke it (if you’ve not included one yourself).

If you haven’t built your project for this platform yet, and Halcyon is unable to do a completely cached build, it does something interesting instead: it deploys an empty slug containing only itself. It then instructs you to use a one-off dyno to execute the build in an environment without any time limits.

Following those instructions will compile everything that’s required, cache it all to S3, then instruct you to push again with an empty commit. This time, a fully cached build should be possible and your app will deploy in seconds.
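The empty-commit push can be done with plain git. A minimal sketch, assuming the heroku remote and master branch used above; push_empty_commit is a hypothetical helper and the commit message is arbitrary.

```shell
# Hypothetical helper: create a commit with no changes and push it,
# which gives Heroku a new revision to build.
push_empty_commit() {
  remote=${1:-heroku}
  branch=${2:-master}
  git commit --allow-empty -m "Trigger rebuild from Halcyon cache" &&
    git push "$remote" "$branch"
}
```

The --allow-empty flag is what lets git record a commit when nothing in the working tree has changed.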

Great job, everybody

So is it over? Have we solved Cabal Hell, slow compiles, and failed deployments?

I don’t know. I’m personally holding out for something directly supported within Cabal and Hackage to make things better. Docker and Nix are two general-purpose solutions that could solve these issues in a complete way by accident. Something entirely different like Backpack could come along and revolutionize the whole thing.

No matter what, Halcyon is making dependency management and deployments easier for us at thoughtbot, and that’s good enough for now.