---
title: Modeling a Paginated API as a Lazy Stream
teaser: 'Pagination of APIs is a performance optimization trick. As a consumer, you
  may want to model things with a lazy stream instead.

  '
tags: web,ruby
author: Joël Quenneville
published_on: 2017-03-29
---

You are integrating with a third-party application that contains statistics on
the most popular baby names for a given year.

You have both high-level stats and per-name information you'd like to display.
It'd be nice if you could write the code like this:

```ruby
class NamesController < ApplicationController
  def index
    @names = Names::Client.new.all_names
  end

  def show
    @name = Names::Client.new.find_name(params[:name])
  end
end
```

## The Client

You could write a simple client like this:

```ruby
module Names
  class Client
    Name = Struct.new(:id, :name, :births)
    BASE_URL = "http://name-service.com"

    def all_names
      fetch_data("/users").
        map { |data| convert_to_name(data) }
    end

    private

    def fetch_data(path)
      # Returns the HTTParty response for the given path
      HTTParty.get(BASE_URL + path)
    end

    def convert_to_name(data)
      Name.new(data["id"], data["name"], data["births"])
    end
  end
end
```

Seems straightforward enough. Almost too easy. You're about to hit your first
roadblock.

## Pagination

As you start using the API, you notice that some results seem to be missing. You
take a closer look and notice that you're always getting exactly 10 results from
the API. _The same 10 results_. Aha! Looks like pagination!

Like many APIs, this one paginates its data for performance because the full set
is very large. The number of items per page seems to be hard-coded to 10.

You could write a method that fetches the 10 results for a given page number but
that's not how your application uses the data. You would like to be able to deal
with the data as a single list. _Breaking the data up into pages is an
implementation detail of the API._

It would be nice to model the data as a stream of data instead. Specifically, a
_lazy_ stream so that we only make the minimum number of HTTP requests. Enter
the `Enumerator`.
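
If you haven't used `Enumerator` before, here is a minimal sketch that has nothing to do with the API. The `evens` name and the sequence itself are made up for illustration. The block describes a sequence that never ends, yet only the values that are actually requested get computed.

```ruby
# An endless sequence of even numbers. The block is suspended at each
# `yielder <<` until the consumer asks for the next value.
evens = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder << n
    n += 2
  end
end

evens.take(3)                          # => [0, 2, 4]
evens.lazy.map { |n| n * 10 }.first(3) # => [0, 20, 40]
```

Calling `lazy` matters for chained operations like `map`: without it, `map` would try to consume the whole (infinite) sequence before returning.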

## Enumerator

You add a new method to the client to work with paginated results. This fetches
a page and then yields the results one at a time until it runs out of local
results. Then it makes a request for the next page and starts the process over
again. The enumeration ends once an HTTP request comes back with an unsuccessful
(non-2xx) response.

```ruby
def fetch_paginated_data(path)
  Enumerator.new do |yielder|
    page = 1

    loop do
      results = fetch_data("#{path}?page=#{page}")

      if results.success?
        results.each { |item| yielder << item }
        page += 1
      else
        # `loop` rescues StopIteration, which ends the enumeration
        raise StopIteration
      end
    end
  end.lazy
end
```

Note that appending `?page=#{page}` to the end of the path is a bit naive and
will only work with URLs that don't have any other query parameters. For more
complex URLs, you will want to use Ruby's [`URI`] library.
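
As a rough sketch of what that could look like, the helper below merges a `page` parameter into whatever query string the path already has. The `with_page` name and the `year` parameter are assumptions for illustration and are not part of the client above.

```ruby
require "uri"

# Hypothetical helper: returns `path` with a `page` parameter merged into
# any query string that is already present.
def with_page(path, page)
  uri = URI.parse(path)
  params = URI.decode_www_form(uri.query || "")
  params.reject! { |key, _| key == "page" } # drop any stale page value
  params << ["page", page.to_s]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

with_page("/users", 2)           # => "/users?page=2"
with_page("/users?year=2015", 2) # => "/users?year=2015&page=2"
```

Inside `fetch_paginated_data`, the string interpolation could then be swapped for a call such as `fetch_data(with_page(path, page))`.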

The client's public `all_names` method doesn't change much. The only difference
is that it calls `fetch_paginated_data` instead of `fetch_data`.

The API you're integrating against returns an HTTP 404 response code for pages
with no results, so the Enumerator stops iterating when it gets a non-successful
status code. For other API implementations, it may make sense to check for empty
results instead. Some APIs provide a link to the "next" page, in which case you
would check for that. The [Bootic client] has an example of this approach.

```ruby
module Names
  class Client
    Name = Struct.new(:id, :name, :births)
    BASE_URL = "http://name-service.com"

    def all_names
      fetch_paginated_data("/users").
        map { |data| convert_to_name(data) }
    end
  end
end
```
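
For an API that signals the end of the data with an empty page rather than a 404, the loop might be sketched as follows. To keep the example runnable without a server, `FAKE_PAGES` and `fetch_page` stand in for the real HTTP call, and `fetch_until_empty` is a hypothetical name. The sample records are made up.

```ruby
# Stand-in for the remote API (an assumption for this sketch): page numbers
# map to arrays of records, and a page past the end comes back empty.
FAKE_PAGES = {
  1 => [{ "name" => "Emma" }, { "name" => "Olivia" }],
  2 => [{ "name" => "Sophia" }],
}.freeze

def fetch_page(page)
  FAKE_PAGES.fetch(page, [])
end

def fetch_until_empty
  Enumerator.new do |yielder|
    page = 1

    loop do
      results = fetch_page(page)
      break if results.empty? # an empty page marks the end of the data

      results.each { |item| yielder << item }
      page += 1
    end
  end.lazy
end

fetch_until_empty.map { |data| data["name"] }.to_a
# => ["Emma", "Olivia", "Sophia"]
```

Only the stopping condition differs from the 404-based version. The yielding logic stays the same.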

[`URI`]: http://ruby-doc.com/stdlib-2.4.1/libdoc/uri/rdoc/URI.html
[Bootic client]:
https://github.com/bootic/bootic_client.rb/blob/master/lib/bootic_client/entity.rb#L12-L23

## The show page

Going back to our controller implementation:

```ruby
class NamesController < ApplicationController
  def index
    @names = Names::Client.new.all_names
  end

  def show
    @name = Names::Client.new.find_name(params[:name])
  end
end
```

Getting all names now works the way you'd expect. But what about that show
action? The API doesn't provide a way to search. You could get all the results
and then filter them in Ruby, but that would cause a lot of useless HTTP
requests. How can you make the minimum number of requests to get the name you
want?

This is where the [lazy Enumerator] really pays off. This code does the minimum
work needed to get us a result.

```ruby
def find_name(name)
  all_names.detect { |n| n.name == name }
end
```

Too simple? Time to try it out! Sofia is the 28th name on the list (and
therefore should be on page 3). If all works the way you expect, the client
should only make requests for pages 1, 2, and 3 and stop once it finds Sofia.

![trying out the code](https://images.thoughtbot.com/jq-lazy-stream-api/FJ6AHTMTgeUUN357Lhj0_lazy-streams-client-server.png)

Success!
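
If you'd like to see the same behavior without running a server, here is a self-contained sketch. Everything in it is an assumption made for illustration: `PAGES` is an in-memory stand-in for the API with generated names, and `fetch_page` records which pages get "requested" instead of making HTTP calls.

```ruby
Name = Struct.new(:id, :name)

# Hypothetical in-memory "server": 5 pages of 10 generated names each.
PAGES = (1..5).to_h do |page|
  first_id = (page - 1) * 10 + 1
  [page, (first_id..first_id + 9).map { |id| Name.new(id, "name-#{id}") }]
end

requested_pages = []

# Simulated HTTP request: logs the page number, returns nil past the end.
fetch_page = lambda do |page|
  requested_pages << page
  PAGES[page]
end

all_names = Enumerator.new do |yielder|
  page = 1
  while (results = fetch_page.call(page))
    results.each { |item| yielder << item }
    page += 1
  end
end.lazy

found = all_names.detect { |n| n.name == "name-28" }
found.id        # => 28
requested_pages # => [1, 2, 3]
```

The 28th entry lives on page 3, so `detect` stops there and pages 4 and 5 are never fetched.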

[lazy Enumerator]: https://ruby-doc.org/core-2.3.0/Enumerator/Lazy.html

## Extra

Want to play around with this concept? The code for the client as well as a
sample server can be found [on
GitHub](https://github.com/JoelQ/lazy-streams-api).

The names used here come from the US Social Security Administration's list of
[most popular names of 2015](https://www.ssa.gov/cgi-bin/popularnames.cgi).

Check out this article on [lazy refactoring] for a different use case of lazy
enumerators.

[lazy refactoring]: https://thoughtbot.com/blog/lazy-refactoring
