I was working on an app that generated a Markdown article. The article content had some dynamic parts that were fetched via HTTP requests. While not a huge problem, this made the article generation slow.
Ruby 3.0 introduced the fiber scheduler interface, which is used by the async
gem to run tasks concurrently. It’s particularly useful for I/O-bound
workloads, so I decided to give it a try. This post is a summary of my journey
in figuring out how to use it.
If you don’t care about any of this, skip to the final thoughts section.
The problem
The article generation code looked like this (I’m using sleep to simulate the time spent on HTTP requests):
class Article
  def to_s
    <<~MARKDOWN
      # #{generate_title}
      #{generate_content}
    MARKDOWN
  end

  def generate_title
    sleep 2
    "A title"
  end

  def generate_content
    5.times.map { |i|
      generate_paragraph(i)
    }.join("\n")
  end

  private

  def generate_paragraph(i)
    sleep 1
    "Paragraph #{i}"
  end
end

t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
Article.new.to_s
t1 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
puts "Time: #{t1 - t0} seconds."
This takes about 7 seconds to run (1 second for each of the 5 paragraphs plus 2 seconds for the title).
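The timing boilerplate around the call can be folded into a small helper. This is only a sketch: measure is a hypothetical name that does not appear in the original code, and the sleeps are scaled down by a factor of ten so the example finishes quickly.

```ruby
# Hypothetical helper (not part of the original code): runs the block and
# returns its result together with the elapsed wall-clock time in seconds.
def measure
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  t1 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  [result, t1 - t0]
end

# Sequential work adds up: one 0.2 s "title" plus five 0.1 s "paragraphs".
_, seconds = measure do
  sleep 0.2
  5.times { sleep 0.1 }
end
puts "Time: #{seconds.round(2)} seconds."
```

Because nothing overlaps here, the total is simply the sum of the individual sleeps, which is the same arithmetic behind the 7 seconds above.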
The journey
After installing the async gem, the first thing I did was wrap the whole code in an Async block as all the examples did.
require "async"

class Article
  def to_s
    Async do
      <<~MARKDOWN
        # #{generate_title}
        #{generate_content}
      MARKDOWN
    end
  end

  # ...
end
When I re-ran the code, it still took seven seconds, and instead of the article body I now got back an Async::Task object.
If I want the result, I need to call #wait on the task.
def to_s
  Async do
    <<~MARKDOWN
      # #{generate_title}
      #{generate_content}
    MARKDOWN
  end.wait # <--- this
end
From the docs, it looks like I can replace this Async { }.wait pattern with Sync { }:
def to_s
  Sync do
    <<~MARKDOWN
      # #{generate_title}
      #{generate_content}
    MARKDOWN
  end
end
Nothing is running asynchronously yet, so let’s try starting with the paragraphs:
def generate_content
  5.times.map { |i|
    Async { generate_paragraph(i) }
  }.join("\n")
end
This makes each iteration async, and the code runs in 3 seconds. Again, we don’t have the values for the paragraphs, just ‘tasks’. Let’s add wait again:
def generate_content
  5.times.map { |i|
    Async { generate_paragraph(i) }.wait
  }.join("\n")
end
Waiting on each async does get the value back, but now everything is running synchronously again, i.e., in 7 seconds. What?!
The “How The Heck Do I Make This Work?” Section
I tried to wrap the whole thing in an Async + wait block with internal async tasks, but it also didn’t work.
def generate_content
  Async do
    5.times.map { |i|
      Async { generate_paragraph(i) }
    }.join("\n")
  end.wait
end
Ok, maybe the problem is using #join right after creating the tasks, which wouldn’t give them time to finish. Against my will, I iteratively built a list:
def generate_content
  paragraphs = []
  Sync do
    5.times do |i|
      Async do
        paragraphs << generate_paragraph(i)
      end
    end
  end
  paragraphs.join("\n")
end
I was surprised this didn’t work. For some reason, the paragraphs array came back empty! I thought the Sync block would wait for the internal Async blocks to finish, but it didn’t.
A Solution
After fighting with this for a while, reading the docs and the source code, I finally got it. I had to wait for the tasks after creating all of them, not right after creating one of them.
def generate_content
  5.times.map { |i|
    Async do
      generate_paragraph(i)
    end
  }
    .map(&:wait) # <--- wait after creating all tasks
    .join("\n")
end
It works! This takes 3 seconds to run, as expected (only the paragraph generation is async for now). Bonus points: #map kept working!
Here’s a visual representation of the difference between the two approaches:
We still have to change Article#to_s to generate the title and the content concurrently:
def to_s
  Sync do
    # We cannot use `Sync` here, because the first task would block the following one
    title_task = Async { generate_title }
    content_task = Async { generate_content }

    title = title_task.wait
    content = content_task.wait

    <<~MARKDOWN
      # #{title}
      #{content}
    MARKDOWN
  end
end
This is fully concurrent now. It takes only 2 seconds to run, a 3.5x speedup!
It’s a bit boring having to #wait, but it’s not a big deal. Here’s the full diagram of the execution:
Thoughts on the Async gem
That was an interesting experience for me. I’ve seldom used Threads in Ruby because they feel easy to mess up, so I was curious to see how Async would work. Here are a few thoughts on it:
The good
I didn’t have to change my code much to make it async.
Any other code using the Article class would work as before, without knowing it’s asynchronous. Is this what people mean by it having no function colors?
It’s very scalable.
As long as your problem is I/O-bound, you can easily add more tasks to run concurrently, and they’re lightweight (orders of magnitude lighter than Threads).
It is the “official” gem for this kind of problem.
It looks like Matz himself invited the gem into core Ruby, but I couldn’t find where or when this happened. Samuel Williams, the author, is a core contributor to Ruby and merged the fiber scheduler interface into Ruby 3.0.
The bad
The docs are… lacking
Documentation and examples are scarce. The guides are brief, and this blog post was one of the only examples I could find.
Sometimes I needed to dig into the code to understand how to use it, which is not an unusual thing to do, but it’s not ideal for simple use cases. I often had questions I couldn’t find answers to in the docs, like:
- Why use Async { }.wait vs Sync { }? The documentation says they’re “very similar”, which leaves me wondering where they would differ.
- What’s the difference (if any) between nested Async blocks and using Async { |task| task.async { ... } }?
- Why use Async::Barrier vs multiple waits?
There’s indeed a call for better docs in the repo.
It’s not compatible with every library
I’m lucky my use case was covered, but if this were a Rails app, for instance, it wouldn’t be possible to use Async to run queries concurrently. It does work with Sequel, though. It’s not a problem with the gem itself, but it’s something to keep in mind.
Wrapping up
All in all, this was a fun experiment and I did get a good speedup. I think this ecosystem is promising and I’m looking forward to seeing more libraries supporting it. The biggest “problem” I had was the lack of documentation, but this is something that we, as a community, can help with.