---
title: A Broader Take on Parsing
teaser: Broadening our definition of "parsing" yields some useful insights.
tags: elm,ruby,parsing,web
author: Joël Quenneville
published_on: 2021-10-14
---

We usually think of "parsing" as turning strings into richer data structures.
Broadening the definition gives us a useful mental model for thinking about
many kinds of transformations.

<small>
Inspired by a <a
href="https://discourse.elm-lang.org/t/domain-driven-type-narrowing/7753/39">discussion
on narrowing types</a> from the Elm discourse.
</small>

## What is parsing?

In prose, we might say: "Parsing is transforming a broader type into a narrower
type, with potential failures".

Described as an Elm type signature, we can say that parsing is a function:

```elm
parse : broaderType -> Result error narrowerType
```

Typically, not all values of the broader type can successfully be transformed
into the narrower type, hence the need for `Result`. For example, the string
`"1"` can be transformed into an integer but the string `"hello"` cannot.

## Parsing non-strings

Under this definition, one can "parse" rich data types into other rich data
types. This often comes up when capturing user input. I often define both a
broader type that captures user input with a lot of uncertainty and a narrower
type that accurately describes the data I want to capture.

```elm
-- Broad type with a lot of uncertainty

type alias ProfileForm =
  { age : String
  , occupation : Maybe Occupation
  }


-- Narrow type

type alias Profile =
  { age : Int
  , occupation : Occupation
  }
```

Then, in response to some action such as the user clicking a submit button, I
can try to parse the narrow `Profile` out of the `ProfileForm` with something
like:

```elm
parseProfile : ProfileForm -> Maybe Profile
parseProfile form =
  Maybe.map2 Profile
    (String.toInt form.age)
    form.occupation
```

On a fancier form, you might use a dedicated [form-decoder library] to parse
these records.

[form-decoder library]: https://arow.info/posts/2019/form-decoding/

## Outputs can also be inputs

One can transform data in several passes, with the parsed output at each
step becoming the raw input of the next step. The types get narrower and
narrower at every step and the pipeline acts as a funnel.

For example, when getting data from an API, we might:

1. Try to parse the string body of the response as a JSON value. This might
   fail because not all strings are valid JSON (note that the `elm/json`
   library does this automatically for us).
2. Try to parse the JSON value into a broad `UserSubmission` record. This might
   fail because not all JSON values are valid user submissions.
3. Combine this `UserSubmission` with some user input and then try to
   parse it into a narrower `User` type. This might fail because not all user
   submissions are valid users.

![Diagram showing 4 rectangles one on top of each other. Each rectangle is smaller than the one above, creating a sort of funnel. On the left, a series of arrows point from each rectangle to the one below it and say "parse". The four rectangles are labeled as follows from top to bottom: 1. String 2. Json.Decode.Value 3. UserSubmission 4. User](https://images.thoughtbot.com/jq-broader-parsing/dSkYbLIYQyKy9B7PL009_funnel-of-parsing.png)

Narrowing types like this is the core idea of [parse, don't validate].

Note that these transformations don't need to happen all at once. Each level
may have domain-relevant operations of its own, and you may not need to parse
to the next level until certain events happen.

[parse, don't validate]: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

## Not just for types

This mental model doesn't just apply to statically-typed functional languages.
Consider how a Ruby program might build [value objects] out of the payload from a
third-party API.

```ruby
require "json"

Movie = Struct.new(:name, :director)

def build_movies(response)
  json = JSON.parse(response.body) # parse string into array of hashes

  # parse hashes into movie objects
  json.map do |obj|
    Movie.new(obj.fetch("movie_name"), obj.fetch("dir"))
  end
end
```

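Here, failure surfaces as an exception (`JSON::ParserError` for an invalid
body, `KeyError` from `fetch` for a missing key) rather than as a `Result`
value, but the concept is the same. We slowly move through a funnel from less
structured to more structured data as we parse from strings to hashes and
finally into movie objects.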
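
To make the failure points of that funnel more visible, here is a rough sketch that rescues the exceptions raised by the two parsing steps and reports which stage rejected the input. The `parse_movies` name and the `[:ok, ...]` / `[:error, ...]` return shape are assumptions made for illustration; they are not part of the code above or of any library.

```ruby
require "json"

Movie = Struct.new(:name, :director)

# Hypothetical helper: takes the raw response body (a string) and returns
# [:ok, movies] or [:error, message] instead of letting exceptions escape.
def parse_movies(body)
  # string -> array of hashes; raises JSON::ParserError on invalid JSON
  json = JSON.parse(body)

  # hashes -> movie objects; fetch raises KeyError when a key is missing
  movies = json.map do |obj|
    Movie.new(obj.fetch("movie_name"), obj.fetch("dir"))
  end

  [:ok, movies]
rescue JSON::ParserError
  [:error, "body is not valid JSON"]
rescue KeyError => e
  [:error, "not a valid movie: #{e.message}"]
end
```

With this shape, `parse_movies("hello")` returns `[:error, "body is not valid JSON"]`, while a body such as `'[{"movie_name": "Alien", "dir": "Ridley Scott"}]'` returns `[:ok, ...]` with a single `Movie`. Whether to rescue at this level or let the exception propagate is a design choice that depends on where the caller wants to handle failure.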

[value objects]: https://thoughtbot.com/upcase/videos/value-objects

## Thinking in transformations

Once you have this broader mental model of parsing in your mind, you will start
to see it everywhere. So much of our work as software developers is transforming
data. On the web, we are constantly dealing with unstructured inputs from APIs
and from users.

Thinking in terms of parsing can help us become more conscious of the boundaries
within our systems and be more intentional in setting them. Because we know that
the transformation at each layer can result in errors, this mental model can
guide us to the hot spots where we need [better test coverage] and
error-handling code.
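
As a small illustration of such a hot spot, consider a single boundary that parses a raw form value into an age. The `parse_age` helper and its "non-negative integer" rule are hypothetical, chosen only to show the kinds of inputs worth covering.

```ruby
# Hypothetical boundary: parse a raw form value into a non-negative age.
# Returns the integer on success and nil on failure.
def parse_age(raw)
  age = Integer(raw, 10) # raises ArgumentError or TypeError on bad input
  age if age >= 0        # negative numbers are rejected (assumed domain rule)
rescue ArgumentError, TypeError
  nil
end
```

Inputs worth testing at this boundary include the empty string, non-numeric text such as `"hello"`, and negative numbers such as `"-1"`, each of which should come back as `nil`, alongside a plain success case like `"42"`.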

[better test coverage]: https://thoughtbot.com/blog/testing-your-edge-cases
