---
title: Applicative Options Parsing in Haskell
teaser:
tags: web,haskell
author: Pat Brisbin
published_on: 2014-06-13
---

I've just finished work on a small command line [client] for the
[Heroku Build API][api] written in Haskell. It may be a bit overkill for
the task, but it allowed me to play with a library I was very interested
in but hadn't had a chance to use yet: [optparse-applicative][optparse].

[client]: https://github.com/pbrisbin/heroku-build
[api]: https://devcenter.heroku.com/articles/platform-api-reference#build
[optparse]: https://hackage.haskell.org/package/optparse-applicative

In figuring things out, I again noticed something I find common to many
Haskell libraries:

1. It's extremely easy to use and solves the problem exactly as I need.
1. It's woefully under-documented and appears incredibly difficult to
   use at first glance.

Note that when I say under-documented, I mean it in a very specific way.
The [Haddocks][docs] are stellar. Unfortunately, what I find lacking are
blogs and example-driven tutorials.

[docs]: https://hackage.haskell.org/package/optparse-applicative-0.9.0/docs/Options-Applicative-Builder.html

Rather than complain about the lack of tutorials, I've decided to write
one.

## Applicative Parsers

Haskell is known for its great parsing libraries, and
optparse-applicative is no exception. For some context, here's an
example of what it looks like to build a parser in Haskell:

```haskell
type CSV = [[String]]

csvFile :: Parser CSV
csvFile = do
    lines <- many csvLine
    eof

    return lines

  where
    csvLine = do
        cells <- csvCell `sepBy` comma
        eol

        return cells

    csvCell = quoted (many anyChar)

    comma = char ','

    eol = string "\n" <|> string "\r\n"

    -- etc...
```

As you can see, Haskell parsers have a fractal nature. You make tiny
parsers for simple values and combine them into slightly larger parsers
for slightly more complicated values. You continue this process until
you reach the top level `csvFile` which reads like exactly what it is.

When combining parsers from a general-purpose library like [parsec]
(as we're doing above), we typically do it *monadically*. This means
that each parsing step is *sequenced* together (that's what
[do-notation] does) and that sequencing will be respected when the
parser is ultimately executed on some input. Sequencing parsing steps in
an imperative way like this allows us to make decisions mid-parse about
what to do next or to use the results of earlier parses in later ones.
This ability is essential in most cases.

[parsec]: http://hackage.haskell.org/package/parsec
[do-notation]: http://www.haskell.org/haskellwiki/Monad#Special_notation
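
To make "use the results of earlier parses in later ones" concrete,
here's a minimal sketch. To keep it dependency-free it uses
`Text.ParserCombinators.ReadP` from base rather than parsec, and the
length-prefixed format and the names `lengthPrefixed` and `runParser`
are made up purely for illustration:

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical format: a decimal length, a colon, then exactly that
-- many characters, e.g. "5:hello".
lengthPrefixed :: ReadP String
lengthPrefixed = do
    n <- read <$> munch1 isDigit -- the result of the first step...
    _ <- char ':'
    count n get                  -- ...decides how much the last step reads

-- Hypothetical helper: run a parser, requiring it to consume all input.
runParser :: ReadP a -> String -> Maybe a
runParser p input =
    case readP_to_S (p <* eof) input of
        [(result, "")] -> Just result
        _              -> Nothing
```

In GHCi, `runParser lengthPrefixed "5:hello"` gives `Just "hello"`,
while `runParser lengthPrefixed "5:hi"` gives `Nothing` because the
length read in the first step demands five characters.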

When using libraries like [optparse-applicative][optparse] and [aeson]
we're able to do something different. Instead of treating parsers as
monadic, we can treat them as applicative. The `Applicative` type class
is a lot like `Monad` in that it's a means of describing combination.
Crucially, it differs in that one step can never depend on the result
of another -- there's no sequencing of decisions.

If it helps, you can think of applicative parsers as *atomic* or
*parallel* while monadic parsers would be *incremental* or *serial*. Yet
another way to say it is that monadic parsers operate on the result of
the previous parser and can only return something to the next; the
overall result is then simply the result of the last parser in the
chain. Applicative parsers, on the other hand, operate on the whole
input and contribute directly to the whole output -- when combined and
executed, many applicative parsers can run "at once" to produce the
final result.

Taking values and combining them into a larger value via some
constructor is exactly how normal function application works. The
`Applicative` type class lets you construct things from values wrapped
in some context (say, a *Parser State*) using a very similar syntax. By
using `Applicative` to combine smaller parsers into larger ones, you end
up with a very convenient situation: the constructed parsers resemble
the structure of their *output*, not their *input*.

[aeson]: http://hackage.haskell.org/package/aeson

When you look at the CSV parser above, it reads like the document it's
parsing, not the value it's producing. It doesn't *look like* an array
of arrays, it looks like a walk over the values and down the lines of a
file. There's nothing wrong with this structure per se, but contrast it
with this parser for creating a `User` from a <abbr title="JavaScript Object Notation">JSON</abbr> value:

```haskell
data User = User String Int

-- Value is a type provided by aeson to represent JSON values.
parseUser :: Value -> Parser User
parseUser (Object o) = User <$> o .: "name" <*> o .: "age"
parseUser _ = fail "expected a JSON object"
```

It's hard to believe the two share any qualities at all, but they are in
fact the same thing, just constructed via different means of
combination.

In the CSV case, parsers like `csvLine` and `eof` are combined
monadically via do-notation:

> You will parse many lines of CSV, *then* you will parse an
> end-of-file.

In the <abbr title="JavaScript Object Notation">JSON</abbr> case, parsers like `o .: "name"` and `o .: "age"` each
contribute part of a `User` and those parts are combined applicatively
via [`(<$>)`][fmap] and [`(<*>)`][apply] (pronounced *fmap* and
*apply*):

> You will parse a user from the value for the "name" key *and* the
> value for the "age" key

[fmap]: http://hackage.haskell.org/package/base-4.7.0.0/docs/Control-Applicative.html#v:-60--36--62-
[apply]: http://hackage.haskell.org/package/base-4.7.0.0/docs/Control-Applicative.html#v:-60--42--62-

Just by virtue of how `Applicative` works, we find ourselves with a
`Parser User` that looks surprisingly like a `User`.
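
The same shape works for any `Applicative`, not just aeson's `Parser`.
As a rough, dependency-free sketch, here's the `User` example again
with `Maybe` as the context and an association list standing in for the
JSON object. `Fields` and `userFromFields` are hypothetical names of my
own, not part of aeson:

```haskell
import Text.Read (readMaybe)

data User = User String Int
    deriving (Eq, Show)

-- Hypothetical stand-in for a JSON object: keys mapped to raw strings.
type Fields = [(String, String)]

-- Each argument to User is a small "parser" in the Maybe context:
-- a lookup that may fail, and a lookup followed by a read that may fail.
userFromFields :: Fields -> Maybe User
userFromFields o = User
    <$> lookup "name" o
    <*> (lookup "age" o >>= readMaybe)
```

In GHCi, `userFromFields [("name", "pat"), ("age", "30")]` gives
`Just (User "pat" 30)`, and dropping or mangling either field gives
`Nothing`. The definition still reads like a `User`.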

I go through all of this not because you need to know about it to use
these libraries (though it does help with understanding their error
messages), but because I think it's a great example of something many
developers don't believe: not only *can* highly theoretical concepts have
tangible value in real-world code, but they in fact *do* in Haskell.

Let's see it in action.

## Options Parsing

My little command line client has the following usage:

```sh
heroku-build [--app COMPILE-APP] [start|status|release]
```

Where each sub-command has its own set of arguments:

```sh
heroku-build start SOURCE-URL VERSION
heroku-build status BUILD-ID
heroku-build release BUILD-ID RELEASE-APP
```

The first step is to define a data type for what you want *out* of
options parsing. I typically call this `Options`:

```haskell
import Options.Applicative -- Provided by optparse-applicative

type App = String
type Version = String
type Url = String
type BuildId = String

data Command
    = Start Url Version
    | Status BuildId
    | Release BuildId App

data Options = Options App Command
```

If we assume that we can build a `Parser Options`, using it in `main`
would look like this:

```haskell
main :: IO ()
main = run =<< execParser
    (parseOptions `withInfo` "Interact with the Heroku Build API")

parseOptions :: Parser Options
parseOptions = undefined

-- Actual program logic
run :: Options -> IO ()
run opts = undefined
```

Where `withInfo` is just a convenience function to add `--help` support
given a parser and description:

```haskell
withInfo :: Parser a -> String -> ParserInfo a
withInfo opts desc = info (helper <*> opts) $ progDesc desc
```

So what does an Applicative Options Parser look like? Well, if you
remember the discussion above, it's going to be a series of smaller
parsers combined in an applicative way.

Let's start by parsing just the `--app` option using the
library-provided `strOption` helper:

```haskell
parseApp :: Parser App
parseApp = strOption $
    short 'a' <> long "app" <> metavar "COMPILE-APP" <>
    help "Heroku app on which to compile"
```

Next we make a parser for each sub-command:

```haskell
parseStart :: Parser Command
parseStart = Start
    <$> argument str (metavar "SOURCE-URL")
    <*> argument str (metavar "VERSION")

parseStatus :: Parser Command
parseStatus = Status <$> argument str (metavar "BUILD-ID")

parseRelease :: Parser Command
parseRelease = Release
    <$> argument str (metavar "BUILD-ID")
    <*> argument str (metavar "RELEASE-APP")
```

Looks familiar, right? These parsers are made up of simpler parsers
(like `argument`) combined in much the same way as our `parseUser`
example. We can then combine them further via the `subparser` function:

```haskell
parseCommand :: Parser Command
parseCommand = subparser $
    command "start"   (parseStart   `withInfo` "Start a build on the compilation app") <>
    command "status"  (parseStatus  `withInfo` "Check the status of a build") <>
    command "release" (parseRelease `withInfo` "Release a successful build")
```

By re-using `withInfo` here, we even get sub-command `--help` flags:

```sh
$ heroku-build start --help
Usage: heroku-build start SOURCE-URL VERSION
  Start a build on the compilation app

Available options:
  -h,--help                Show this help text
```

Pretty great, right?

All of this comes together to make the full `Options` parser:

```haskell
parseOptions :: Parser Options
parseOptions = Options <$> parseApp <*> parseCommand
```

Again, this looks just like `parseUser`. You might've thought that
`o .: "name"` was some kind of magic, but as you can see, it's just a
parser. It was defined in the same way as `parseApp`, designed to parse
something simple, and is easily combined into a more complex parser
thanks to its applicative nature.

Finally, with option handling thoroughly taken care of, we're free to
implement our program logic in terms of meaningful types:

```haskell
run :: Options -> IO ()
run (Options app cmd) =
    case cmd of
        Start url version  -> undefined -- ...
        Status build       -> undefined -- ...
        Release build rApp -> undefined -- ...
```
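
If you'd like to poke at the parser without building a binary, one
option is to run it on an explicit argument list. This sketch repeats
the definitions from above so it stands alone, and adds a hypothetical
`parseArgs` helper. It assumes a more recent optparse-applicative than
the 0.9.0 docs linked earlier, one where `execParserPure` returns a
`ParserResult` and `getParseResult` is available:

```haskell
import Options.Applicative

type App = String
type Version = String
type Url = String
type BuildId = String

data Command
    = Start Url Version
    | Status BuildId
    | Release BuildId App
    deriving (Eq, Show)

data Options = Options App Command
    deriving (Eq, Show)

withInfo :: Parser a -> String -> ParserInfo a
withInfo opts desc = info (helper <*> opts) $ progDesc desc

parseApp :: Parser App
parseApp = strOption $
    short 'a' <> long "app" <> metavar "COMPILE-APP" <>
    help "Heroku app on which to compile"

parseStart :: Parser Command
parseStart = Start
    <$> argument str (metavar "SOURCE-URL")
    <*> argument str (metavar "VERSION")

parseStatus :: Parser Command
parseStatus = Status <$> argument str (metavar "BUILD-ID")

parseRelease :: Parser Command
parseRelease = Release
    <$> argument str (metavar "BUILD-ID")
    <*> argument str (metavar "RELEASE-APP")

parseCommand :: Parser Command
parseCommand = subparser $
    command "start"   (parseStart   `withInfo` "Start a build on the compilation app") <>
    command "status"  (parseStatus  `withInfo` "Check the status of a build") <>
    command "release" (parseRelease `withInfo` "Release a successful build")

parseOptions :: Parser Options
parseOptions = Options <$> parseApp <*> parseCommand

-- Hypothetical helper (not from the post): parse an explicit argument
-- list purely, returning Nothing on any failure instead of exiting.
parseArgs :: [String] -> Maybe Options
parseArgs = getParseResult . execParserPure defaultPrefs
    (parseOptions `withInfo` "Interact with the Heroku Build API")
```

With this loaded in GHCi, `parseArgs ["--app", "my-app", "status", "abc123"]`
should give `Just (Options "my-app" (Status "abc123"))`, while leaving
off `--app` gives `Nothing`, since `strOption` without a default makes
the option required.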

## Wrapping Up

To recap, optparse-applicative allows us to do a number of things:

- Implement our program input as a meaningful type
- State how to turn command-line options into a value of that type in a
  concise and declarative way
- Do this even in the presence of something complex like sub-commands
- Handle invalid input and get a really great `--help` message for free

Hopefully, this post has piqued some interest in Haskell's deeper ideas,
which I believe lead to most of these benefits. If not, at least there
are some real-world examples that you can reference the next time you
want to parse command-line options in Haskell.