Meet Fiber, Thread's Cooperative Cousin

Many developers will be familiar with Ruby’s Thread, but perhaps less well known is another of its concurrency primitives: Fiber.

As with threads, we can use fibers to create code blocks for concurrent execution. Here’s a simple example:

odd_fiber = Fiber.new do
  Fiber.yield 1 # pause here; 1 is returned by the first resume
  3             # returned by the second resume, when the block finishes
end

even_fiber = Fiber.new do
  Fiber.yield 2 # pause here; 2 is returned by the first resume
  4             # returned by the second resume, when the block finishes
end

puts odd_fiber.resume
puts even_fiber.resume
puts odd_fiber.resume
puts even_fiber.resume

This code will output:

1
2
3
4

As this example shows, fibers can be paused and restarted with Fiber.yield and resume. Threads can be interleaved too, but switching between them is typically handled by the operating system, which can suspend a thread at any point; this is known as preemptive scheduling. Fibers, by contrast, are cooperatively scheduled: they run until they voluntarily yield control, at a point chosen by the programmer.

Because fibers are managed entirely within the Ruby VM rather than by the operating system, they reveal another important difference: they are more lightweight than threads. They are faster to create and switch between, and we can keep far more of them alive at once.
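
As a rough illustration (far from a rigorous benchmark, and timings will vary by machine and Ruby version), we can compare the cost of creating and running a fiber against that of a thread:

require 'benchmark'

n = 10_000

Benchmark.bm(8) do |x|
  # Create n fibers and run each to completion
  x.report('fibers:')  { n.times { Fiber.new {}.resume } }
  # Create n threads and wait for each to finish
  x.report('threads:') { n.times { Thread.new {}.join } }
end

We would expect the fiber loop to finish noticeably faster than the thread loop.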

Fibers in action

We might not come across fibers directly in the applications we work on. It turns out, though, that a lightweight construct that allows execution to be paused and resumed can enable some powerful higher-level abstractions. Let’s take a look at some examples in the Ruby ecosystem.

Ruby’s Enumerator

When called without a block, Ruby’s enumerable methods return an Enumerator. This class allows us to iterate over a collection externally by calling next:

irb(main):001:0> enumerator = [1, 2, 3].reverse_each
=> #<Enumerator: ...>
irb(main):002:0> enumerator.next
=> 3
irb(main):003:0> enumerator.next
=> 2
irb(main):004:0> enumerator.next
=> 1

Under the hood this is implemented with a fiber that yields each value in turn, pausing execution and resuming when the next value is requested.
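
We can sketch the idea ourselves. The following is purely illustrative, not Ruby’s actual Enumerator implementation: the hypothetical FiberEnumerator class wraps a collection in a fiber that yields one item per call to next:

class FiberEnumerator
  def initialize(collection)
    @fiber = Fiber.new do
      collection.each { |item| Fiber.yield(item) }
      raise StopIteration # signal exhaustion, as Enumerator#next does
    end
  end

  def next
    @fiber.resume # runs the fiber until it yields the next item
  end
end

enumerator = FiberEnumerator.new([3, 2, 1])
puts enumerator.next # => 3
puts enumerator.next # => 2
puts enumerator.next # => 1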

Batch loading: graphql-ruby’s Dataloader

When implementing a GraphQL API, it’s easy to generate N+1 queries when resolving the same field for objects in a collection (fetching all the posts by each of a collection of users, for example). The solution is to batch load all the required data upfront before resolving the field for each object.

The graphql-ruby gem recently introduced a new batch loading mechanism that works by resolving fields inside a fiber. Fibers for all sibling fields are created and paused, the data for these fields is batch loaded with a single query, and the fibers are resumed and the fields resolved.
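
To make that concrete, here is a stripped-down sketch of the pattern using plain fibers. It is purely illustrative and is not graphql-ruby’s actual Dataloader API; the in-memory “query” stands in for a real database call:

pending_user_ids = []
posts_by_user = {}

# One fiber per field to resolve; each records the ID it needs, then pauses.
resolvers = [1, 2, 3].map do |user_id|
  Fiber.new do
    pending_user_ids << user_id
    Fiber.yield # pause until the batch load has happened
    puts "User #{user_id} posts: #{posts_by_user[user_id].inspect}"
  end
end

resolvers.each(&:resume) # run each resolver up to its pause point

# A single batch "query" for all pending users, instead of one query each,
# e.g. Post.where(user_id: pending_user_ids).group_by(&:user_id)
pending_user_ids.each { |id| posts_by_user[id] = ["post-#{id}a", "post-#{id}b"] }

resolvers.each(&:resume) # resume each resolver to finish resolving its field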

Async

The async gem takes advantage of new capabilities in Ruby 3 to implement efficient concurrent handling of IO-bound operations. Ruby 3 introduced non-blocking fibers, which can yield to a scheduler when waiting on IO. Async implements a fiber scheduler and provides an API that allows us to create tasks, wrapped in fibers, for asynchronous execution. The following code will complete in two seconds, rather than the three required to run it synchronously:

require 'async'

Async do |task|
  task.async do
    sleep 2 # non-blocking inside a task: yields to the scheduler while waiting
    puts 'Goodbye'
  end

  task.async do
    sleep 1
    puts 'Hello'
  end
end

This outputs:

Hello
Goodbye

This is perhaps the most interesting use case for fibers. Ruby performance is a hot topic, and leveraging fibers for asynchronous processing of IO-bound workloads is an important development.

Fibers are interesting in their own right, as an example of a type of virtualised concurrency primitive that exists across many programming languages. As we’ve seen, they also have some important applications, particularly in the performance space. They are a good fit for concurrent tasks that require manual scheduling. And, as of Ruby 3, they offer a potential alternative to threads for efficient processing of IO-bound workloads.