---
title: Preloading Nested Associations with Ecto
teaser: A look at how to preload database associations using Elixir's Ecto library.
tags: elixir,ecto,web
author: Greg Lazarev
published_on: 2017-01-11
---

For the purpose of this article, assume we are building a simple blog app -- our
users can write blog posts and make comments on them. We're interested in the
associations, or relationships, between the models. A single blog post belongs
to a user and has many comments. Any given comment belongs to a user and a blog
post. Finally, a user has many posts and has many comments.

Now, say we want to display a user with a list of their post titles and all of
the comments made on those posts. In such a use case, posts are nested under the
user and comments are nested under posts.

    User -> Post -> Comment

We could start off by writing something along the lines of:

```elixir
user = Blog.Repo.get(Blog.User, ^user_id)

# or via query expression

user = Blog.Repo.one from user in Blog.User,
  where: user.id == ^user_id
```

Trying to access posts on this user will give us an error:

    cannot encode association :posts from Blog.User to JSON because the
    association was not loaded. Please make sure you have preloaded the
    association or remove it from the data to be encoded

It is letting us know that we must explicitly preload the data that we are
accessing. This is good, as it prevents accidental N+1 queries from negatively
affecting our performance. We can preload our associations with:

```elixir
user = Blog.Repo.one from user in Blog.User,
  where: user.id == ^user_id,
  preload: [posts: :comments]
```

Our query works and we can access user's posts and related comments. But, we are
generating three queries -- one to find our user, one to load its posts, and one
for the comments.

```sql
SELECT u0."id", u0."name", u0."email", u0."inserted_at", u0."updated_at"
  FROM "users" AS u0
  WHERE (u0."id" = $1) [1]
SELECT p0."id", p0."title", p0."body", p0."user_id", p0."inserted_at", p0."updated_at", p0."user_id"
  FROM "posts" AS p0
  WHERE (p0."user_id" = $1)
  ORDER BY p0."user_id" [1]
SELECT c0."id", c0."body", c0."user_id", c0."post_id", c0."inserted_at", c0."updated_at", c0."post_id"
  FROM "comments" AS c0
  WHERE (c0."post_id" = $1)
  ORDER BY c0."post_id" [1]
```

This isn't an N+1 problem, since it will only generate a query per preloaded
association and not per each record. However, we've been taught that the less
queries we make, the better it is. Could we have one query to load all the users
plus their posts and comments? Yes, we can achieve that by using SQL joins and
preloading our associations directly into the structs:

```elixir
user = Blog.Repo.one from user in Blog.User,
  where: user.id == ^user_id,
  left_join: posts in assoc(user, :posts),
  left_join: comments in assoc(posts, :comments),
  preload: [posts: {posts, comments: comments}]
```

Success! We preloaded all of our data using SQL joins by generating a single
query:

```elixir
SELECT u0."id", u0."name", u0."email", u0."inserted_at", u0."updated_at", p1."id", p1."title", p1."body", p1."user_id", p1."inserted_at", p1."updated_at", c2."id", c2."body", c2."user_id", c2."post_id", c2."inserted_at", c2."updated_at" FROM "users" AS u0 LEFT OUTER JOIN "posts" AS p1 ON p1."user_id" = u0."id" LEFT OUTER JOIN "comments" AS c2 ON c2."post_id" = p1."id" WHERE (u0."id" = $1) [1]
```

However, it can be a bit cryptic trying to figure out what is going on in that
Ecto expression. Let's break it down:

* First, we create a binding for the user's data, in this case `user` (it can be
  named anything, like `u` or `foo` as long as it's consistent in the query
  expression).
* Next, we tell our `left_join` clause what tables to join -- join the `posts`
  table using the `user_id` foreign key, then join the `comments` table using
  the `post_id` foreign key.
* Finally, we preload the joined data into our `posts` and `comments` struct
  bindings. Again, these can be named anything, but must match throughout the
  entire expression.

The preload syntax for nested associations looks a bit odd at first, but it
really is just a list of key-value pairs (where value can be an atom, a tuple or
a list). Imagine if our users also had multiple file uploads and we wanted to
preload those too, we'd write:

```elixir
preload: [posts: {posts, comments: comments}, uploads: uploads]
```

Ecto provides two different types of syntax for queries: keyword-based and
pipe-based (also called macro-based). So far, we've been using the keyword-based
query syntax. Much of Elixir code is written with the help of the pipe `|>`
operator, thus we'll re-write our query using pipe-based expression which
favors the pipe.

```elixir
user = Blog.User
|> where([user], user.id == ^user_id)
|> join(:left, [user], posts in assoc(user, :posts))
|> join(:left, [user, posts], comments in assoc(posts, :comments))
|> preload([user, posts, comments], [posts: {posts, comments: comments}])
|> Blog.Repo.one
```

Looks nice, doesn't it? But there's a slight disadvantage of having to specify
the bindings for every operation. Because we are chaining individual functions,
the names of the bindings don't have to be consistent across refinement queries,
and we can write our expression as:

```elixir
user = Blog.User
|> where([user], user.id == ^user_id)
|> join(:left, [u], _ in assoc(u, :posts))
|> join(:left, [_, posts], _ in assoc(posts, :comments))
|> preload([_, p, c], [posts: {p, comments: c}])
|> Blog.Repo.one
```

One last thing worth pointing out about bindings is that they are positional.
We cannot re-arrange them like `[comments, user, posts]`. However, if we only
need the first binding, we don't have to specify the rest. Thus, in our join
queries we only specify bindings for user and posts. Further, a `...` operator
allows us to avoid listing out bindings which we don't care about. For example,
say we only needed the first and the last binding in our query expression, we
could write our bindings as `[user, ..., comments]`, independently of how many
bindings are in between (can be zero or many). Since we don't need a binding for
the user in our `preload`, we can write it as:

```elixir
preload([..., posts, comments], [posts: {posts, comments: comments}])
````

You can read more about preloading associations in the [official Ecto docs].
This article was written using Ecto 2.1.2.

[official Ecto docs]: https://hexdocs.pm/ecto/Ecto.Query.html#preload/3
