An Elm debugging story

I recently discovered a new way to trigger a stack overflow in Elm - chaining 10K random generators 🙈

Understanding why this happens and how to resolve it was quite the journey. Along the way I had to bust out several different debugging techniques including proving myself wrong and reasoning by analogy.

-- Fine with 1K, stack overflow for 10K
sprawlOut : Int -> Location -> Generator ( Location, Set Location )
sprawlOut depth loc =
  List.foldl
    (\_ acc -> R.andThen sprawl acc)
    (R.constant ( loc, Set.empty ))
    (List.range 1 depth)

sprawl : ( Location, Set Location ) -> Generator ( Location, Set Location )
sprawl ( loc, set ) =
  R.uniform loc (neighborsOf loc)
    |> R.map (\next -> ( next, Set.insert next set ))

See the full original problem in action

Simplify the problem

That random generator is intimidating! 😱😱😱

It’s doing two things: generating locations (via the sprawl generator), and accumulating them in a set.

It might be easier to figure out the answer if we work with a simpler problem.

Let’s replace that sprawl generator with one that generates random numbers.

dieRoll : Generator Int
dieRoll =
    Random.int 1 6

We can extract the accumulation part to a separate function and use a plain old list instead of a Set. Also, it turns out all the rolls are independent so we can use map2 instead of andThen.

combine : List (Generator a) -> Generator (List a)
combine =
    List.foldl (Random.map2 (::)) (Random.constant [])

Putting those two together, our simplified problem looks like this:

-- Stack overflow!
List.map (always dieRoll) (List.range 0 10000)
  |> combine

We’re still getting a stack overflow. That’s good! If we didn’t, it would mean we’d oversimplified and removed something essential to reproducing the bug along the way. As it stands, if we can find a way to generate 10K random die rolls, we’ll also be able to use the same solution to fix our original problem.

See the simplified problem in action.

Construct experiments

Make a guess about why things are breaking. Then try to prove yourself wrong!

Building up large lists of generators causes a stack overflow

-- Works fine
List.map (always dieRoll) (List.range 0 10000)

This doesn’t stack overflow so that’s not the cause of our error.

Process of elimination

Time to scale! List some of the possible reasons the code might be broken, then try to prove them all wrong. At each step you’ll learn something new.

We’ve already proven that building a large list of generators is fine. The error must be in combine.

Looking into the implementation, three possibilities jump to mind:

  1. :: breaks on large lists
  2. List.foldl breaks on large lists
  3. Random.map2 (::) breaks on large lists

Let’s try to prove them all wrong!

-- No problem!
0 :: (List.range 1 10000)


-- No problem!
List.foldl (::) [] (List.range 1 10000)


-- No problem!
Random.map2 (::)
  (Random.constant 0)
  (Random.constant <| List.range 1 10000)

Hmmm, all of the individual pieces work fine on their own. Maybe the problem lies with how they play together 🤔

How are things the same?

combine is a pretty common function that can be defined for many types. For example, you could also define it for Maybe:

combineMaybes : List (Maybe a) -> Maybe (List a)
combineMaybes =
  List.foldl (Maybe.map2 (::)) (Just [])

and for Result:

combineResults : List (Result e a) -> Result e (List a)
combineResults =
  List.foldl (Result.map2 (::)) (Ok [])

Time for a new hypothesis:

There is a fundamental flaw in the algorithm for combine that breaks down at larger list sizes.

Let’s try to disprove it!

-- No problem!
List.map Just (List.range 1 10000)
  |> combineMaybes


-- No problem!
List.map Ok (List.range 1 10000)
  |> combineResults

Hmmm, combine seems to be working fine here. Let’s try a more complicated type for good measure: JSON decoders.
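Here combineDecoders is assumed to follow the exact same pattern as the other two, built with Decode.map2 (a sketch, assuming import Json.Decode as Decode exposing (Decoder)):

combineDecoders : List (Decoder a) -> Decoder (List a)
combineDecoders =
  List.foldl (Decode.map2 (::)) (Decode.succeed [])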

-- Stack overflow!
List.map Decode.succeed (List.range 1 10000)
  |> combineDecoders

Huh, why does combine fail for large lists of generators and decoders but not for maybes and results? Or put differently, how are the Generator and Decoder types similar to each other, but different from Maybe and Result?

Looking at the source for Random, we can see that Generator is actually a function under the hood:

type Generator a =
    Generator (Seed -> (a, Seed))

and Random.map2 chains those functions together, threading the Seed from one generator to the next.
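A rough sketch of how that chaining works (simplified from the same source):

-- Each map2 wraps the two generator functions in a new closure:
-- run the first, pass its seed to the second, combine the values
map2 : (a -> b -> c) -> Generator a -> Generator b -> Generator c
map2 func (Generator genA) (Generator genB) =
  Generator
    (\seed0 ->
      let
        ( a, seed1 ) =
          genA seed0

        ( b, seed2 ) =
          genB seed1
      in
      ( func a b, seed2 )
    )

So every map2 in our fold wraps the previous generator’s function in yet another closure, and running the final generator means making thousands of nested calls.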

The Decoder type is implemented in JavaScript for efficiency, but if one were to implement it in Elm it would also be a function under the hood:

-- What a pure Elm implementation could look like:
type Decoder a =
  Decoder (Value -> Result Error a)

So it looks like combine breaks on types that wrap a function. That’s surprising. It would be like saying that composing a function 10K times doesn’t work!

Just for thoroughness, let’s try to prove that last statement wrong:

increment n = n + 1

-- Stack overflow (when the composed function is finally called)!
List.map (always increment) (List.range 1 10000)
  |> List.foldl (<<) identity
  |> (\composed -> composed 0)

🤯🤯🤯

This is HUGE! Not just because I made an incorrect assumption (it happens!), but because this discovery is going to enable me to bring a powerful debugging technique to bear - reasoning by analogy.

Reasoning by analogy

Reasoning by analogy is a problem-solving technique that goes like this:

  1. You have a hard problem you don’t know how to solve
  2. Find an easy problem that’s similar
  3. Figure out the answer to the easy problem
  4. Use the same solution to solve the hard problem

We’ve already reached for a variation of this technique when we simplified our generator right at the beginning! Our hard problem is “generating the complex set of locations causes stack overflow for large sizes”. We simplified it to “generating 10K random numbers causes stack overflow”. If we can solve that, then we can retrofit the solution onto our original hard problem.

However, even our “easier problem” is proving difficult to solve. We’ve used some other debugging techniques to eliminate some possibilities and gain better understanding of what might be causing the problem.

We’ve also discovered an even easier problem that’s similar: “composing a large number of functions causes a stack overflow”. This time however, it’s a problem we already know the solution to! It’s definitely possible to increment 10K times in a stack safe way.

We can do this with a recursive helper (sketched right after the fold version) or via List.foldl:

-- Works fine!
List.range 1 10000
  |> List.foldl (\_ acc -> increment acc) 0
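
And for completeness, the recursive version could look something like this (a sketch; Elm compiles self-recursive tail calls into loops, so it is stack safe as well):

incrementTimes : Int -> Int -> Int
incrementTimes n acc =
  if n <= 0 then
    acc
  else
    -- tail call: no stack frames pile up
    incrementTimes (n - 1) (increment acc)

-- incrementTimes 10000 0 == 10000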

So what’s the difference between these eager approaches and folding <<? Let’s look at each step of the two folds:

-- List.foldl (<<)
--
-- Each step composes `increment` with the
-- result of the previous step
(\x -> increment x)
(\x -> increment (increment x))
(\x -> increment (increment (increment x)))
(\x -> increment (increment (increment (increment x))))
...
-- stack overflow!

-- List.foldl (\_ x -> increment x)
--
-- Each step increments the result of the previous step
0
1
2
3
...
10000

When folding composition, we’re building a giant nested function but never executing any of the steps along the way. When that composed function is finally called, the 10K nested function calls are enough to blow the stack on my machine.

When folding increment directly, we’re eagerly evaluating each step. We aren’t building deeper and deeper functions; instead, we do the work immediately at each step and increment the number.


Now that we have some facts, let’s do some reasoning! Given that:

  1. 10K chains of Random.map2 cause a stack overflow
  2. 10K compositions of increment cause a stack overflow
  3. I can apply increment 10K times, as long as each step is eagerly evaluated.

Therefore (by analogy) I should also be able to accumulate the results of 10K random generators as long as each step is eagerly evaluated.


Working our way back to the original problem

So how does one “eagerly” evaluate a random generator? If we have a seed we can use Random.step to manually evaluate the generator, giving us back a random value immediately.
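Concretely, Random.step takes a generator and a seed and hands back the generated value together with the next seed (the 42 below is just an arbitrary seed for illustration):

-- Random.step : Generator a -> Seed -> ( a, Seed )
Random.step dieRoll (Random.initialSeed 42)
-- gives back ( roll, nextSeed ): a concrete Int between 1 and 6,
-- plus the seed to feed into the next step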

Let’s try those 10K random dice rolls from earlier, but now with an eager approach:

-- This works fine!
numbers : List Int
numbers =
  List.range 0 10000
    |> List.foldl numbersHelp ( [], Random.initialSeed 1234 )
    |> Tuple.first -- don't need the seed anymore

numbersHelp : a -> ( List Int, Seed ) -> ( List Int, Seed )
numbersHelp _ ( acc, seed ) =
    let
        -- step the dieRoll generator eagerly: we get a concrete
        -- roll and the seed to use on the next iteration
        ( roll, newSeed ) =
            Random.step dieRoll seed
    in
    ( roll :: acc, newSeed )

Oh look! That works! Looks like our reasoning by analogy pointed us to a working solution!

If we can solve our simplified 10K random dice rolls problem, we should also be able to use a similar solution for the original problem.

sprawlOut : Int -> Location -> R.Seed -> (( Location, Set Location ), R.Seed)
sprawlOut depth loc seed0 =
  List.foldl
    (\_ (acc, nextSeed) -> R.step (sprawl acc) nextSeed)
    (( loc, Set.empty ), seed0)
    (List.range 1 depth)


sprawl : ( Location, Set Location ) -> Generator ( Location, Set Location )
sprawl ( loc, set ) =
  R.uniform loc (neighborsOf loc)
    |> R.map (\next -> ( next, Set.insert next set ))

It’s not particularly nice, but it works! Success!
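Callers now have to supply a seed themselves. A hypothetical call site (startLoc and the seed value 1234 are made up for illustration) would look something like:

-- startLoc : Location is a placeholder starting point
sprawlOut 10000 startLoc (R.initialSeed 1234)
-- returns ( ( finalLocation, visitedSet ), nextSeed ): the last
-- location reached, every location visited along the way, and a
-- seed to thread into the next random computation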

See the final solution in action

Conclusion

Several debugging techniques were used in the course of solving this problem. Simplifying the problem is always a good place to start. It’s a variation on reasoning by analogy, although analogies are allowed to be more than just simplified versions of your original problem.

When you’re debugging a problem, you will have a lot of assumptions. Test all of them and attempt to disprove them.

Look at your code and try to list the ways it could break. Then go through them all and try to see if they can trigger your bug. Use process of elimination to narrow the focus of your search. Make sure to test all your assumptions along the way though!

Once you have some data about some combinations that work and some that break, ask yourself “how are these the same?” and “how are these different?”.

Discussion around the stack overflow seen in this article and its solution led to a PR against Random.Extra adding some stack-safe helper functions.
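That PR isn’t reproduced here, but as an illustration of the general idea, a stack-safe combine can be built on the same eager Random.step trick, using Random.independentSeed to obtain a seed from inside a generator. This is only a sketch (and it draws from an independent seed rather than threading the caller’s seed, so it isn’t a drop-in replacement for the combine above):

combineSafe : List (Random.Generator a) -> Random.Generator (List a)
combineSafe generators =
  Random.independentSeed
    |> Random.map
        (\seed0 ->
          generators
            |> List.foldl
                (\gen ( acc, seed ) ->
                  let
                    ( value, nextSeed ) =
                      Random.step gen seed
                  in
                  ( value :: acc, nextSeed )
                )
                ( [], seed0 )
            |> Tuple.first
            |> List.reverse
        )

Because List.foldl is a loop and Random.step evaluates each generator eagerly, combineSafe (List.repeat 10000 dieRoll) should produce 10K rolls without blowing the stack.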