I remember starting out with Ruby and feeling intimidated by the idea of using threads. I hadn’t come across them in any of the Ruby code I’d seen so far, and didn’t MRI’s Global Interpreter Lock mean that writing threaded code would yield marginal benefits anyway?
Well, it turns out that writing threaded code in Ruby need not be scary, and there are use cases where leveraging threads can make your code more performant without adding a lot of additional complexity.
Before going any further I should mention that a resource I found super useful when leveling up my knowledge of Ruby threading was Jesse Storimer’s excellent eBook Working with Ruby Threads - I can’t recommend it enough.
The Global Interpreter Lock
MRI has a Global Interpreter Lock, often called the GIL, and having a high-level understanding of it is important to understanding how we write multi-threaded code in Ruby. Basically the GIL prevents multiple Ruby threads from executing at the same time. This means that no matter how many threads you spawn, and how many cores you have at your disposal, MRI will never be executing Ruby code in more than one thread in parallel. Note that this is not the case for JRuby or Rubinius, which do not have a GIL and offer true multi-threading.
My understanding is limited here, but as I see it the existence of the GIL provides some guarantees and removes certain issues around concurrency within MRI. It’s important to note, however, that even with the GIL it’s very possible to write code which isn’t thread-safe in MRI.
So when does using threads make sense?
To ensure all citizens are treated fairly the underlying operating system handles context switching between threads, i.e. when to pause execution of one thread and start or resume execution of another thread. We said above that the GIL prevents multiple Ruby threads within a process from executing at the same time, but a typical Ruby program will spend a significant amount of time not executing Ruby code, specifically waiting on blocking I/O. This could be waiting for HTTP requests, database queries or file system operations to complete. While one thread is waiting for these operations to complete, the GIL is released and another thread can execute. We can take advantage of this and perform other work, which doesn’t depend on the result, while we wait for the I/O to finish.
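To make that concrete, here is a minimal sketch (not taken from our search service) in which `sleep` stands in for a blocking I/O call. The names `events` and `waiter` are made up for the illustration. The main thread gets its own work done while the child thread is still waiting:

```ruby
# Queue is thread-safe, so both threads can push to it without extra locking.
events = Queue.new

# The child thread "waits on I/O". sleep is only a stand-in here; like real
# blocking I/O, it lets other threads run while it waits.
waiter = Thread.new do
  sleep(0.2)
  events << :io_finished
end

# Meanwhile the main thread carries on with work that doesn't need the result.
events << :other_work_done

waiter.join
order = [events.pop, events.pop]
p order
```

The other work is recorded first because the main thread did not have to wait for the simulated I/O before carrying on.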
Some examples of Ruby projects you may already know which make use of threads for performance reasons are the Puma web server and Bundler.
An Example: Performing HTTP requests concurrently
At thoughtbot we have a simple search service which takes a search term and searches across a few other internal services and presents the results in a segmented way. The searching is performed via API calls over HTTPS to the various services. The results from one service don’t in any way affect the results from another service, but we were performing the searches serially, waiting for each one to complete before starting the next. This meant that, at a minimum, our response times would be the combined API response times from the different services plus time spent handling the incoming request, stitching together the results and returning a response to the end user.
This seemed like a great case for multi-threading, performing the various HTTP requests concurrently. We still need them all to complete before returning the results to the user but our wait time should now become the length of time taken for the slowest API request to complete.
At a high level, this should change our timeline of waiting for API requests from one where each request starts only after the previous one has finished to one where the requests overlap.
The API request dispatching happens in this gather_result_sets method:
def gather_result_sets
  search_services.map do |name, search|
    ResultSet.new(name, search.results)
  end
end
This iterates over each of our search services, calling search.results (which
is where the API request happens) and the method returns a list of ResultSet
instances, a thin wrapper around the search results.
It’s a pretty small change to spawn a new Thread for each search:
def gather_result_sets
  search_services.map do |name, search|
    Thread.new { ResultSet.new(name, search.results) }
  end.map(&:value)
end
Calling Thread.new with a block creates a new thread separate from the main
thread’s execution and executes the passed block in that new thread. (That’s
right, in Ruby we’re always executing in the context of a thread. Generally,
unless we’re creating our own child threads we’re in the context of the main
thread, which we can access using Thread.main. We can also access the current
context using Thread.current.)
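As a quick sketch of that aside, the snippet below compares Thread.current with Thread.main both at the top level and inside a child thread. The variable names are only for illustration:

```ruby
# At the top level of a script we are normally running in the main thread.
main_is_current = Thread.current.equal?(Thread.main)

# Inside the block, Thread.current is the newly created child thread.
child = Thread.new { Thread.current }
thread_seen_by_child = child.value

child_saw_itself = thread_seen_by_child.equal?(child)
child_is_main = thread_seen_by_child.equal?(Thread.main)

puts "main thread is current at top level: #{main_is_current}"
puts "child saw itself as current:         #{child_saw_itself}"
puts "child is the main thread:            #{child_is_main}"
```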
Our first map returns a list of Thread instances, which we then map over
calling #value on each. The #value method does two things. First it
causes the main thread to wait for our child thread to complete using
#join. Creating threads without calling join on them will cause the main
thread to continue without waiting and possibly exit before the child threads
have finished, killing them in the process. Secondly #value returns the
result (return value) of the block, or raises the exception (if any) which
caused the thread to terminate.
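Here is a small sketch of both behaviours of #value, using made-up blocks rather than our real searches. The report_on_exception setting is only there to keep the failing thread from also printing its error to stderr:

```ruby
# A thread whose block finishes normally: #value waits, then returns the result.
succeeding = Thread.new { 21 * 2 }
result = succeeding.value

# A thread whose block raises: #value re-raises that exception in the caller.
failing = Thread.new do
  Thread.current.report_on_exception = false
  raise ArgumentError, "search failed"
end

rescued_message = nil
begin
  failing.value
rescue ArgumentError => e
  rescued_message = e.message
end

puts result
puts rescued_message
```

In the threaded gather_result_sets above, this means an exception raised by one search would surface from the final map(&:value) call, so it can be rescued there like any other error.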
Benchmarking
I ran a benchmark to compare the single threaded version to the multi-threaded version. This shows a good speed up (comparing the “real” column):
                      user     system      total        real
single threaded   0.050000   0.000000   0.050000 (  2.733866)
multi threaded    0.050000   0.010000   0.060000 (  1.193998)
As we predicted before, the overall time of the multi-threaded version correlates with the time taken for the slowest of the underlying services to return.
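I haven't reproduced the original benchmark script here, but a comparison along these lines can be sketched with Ruby's Benchmark module. The delays below are invented, and sleep stands in for the real API calls, so the numbers it prints will not match the table above:

```ruby
require "benchmark"

# Hypothetical response times, in seconds, for three imaginary services.
simulated_delays = [0.3, 0.1, 0.2]

# One "request" after another.
serial = lambda do
  simulated_delays.map { |delay| sleep(delay); delay }
end

# One thread per "request", then wait for all of them.
threaded = lambda do
  simulated_delays.map { |delay| Thread.new { sleep(delay); delay } }.map(&:value)
end

report = Benchmark.bm(16) do |x|
  x.report("single threaded") { serial.call }
  x.report("multi threaded") { threaded.call }
end

single_real, multi_real = report.map(&:real)
```

With these delays I'd expect the single threaded "real" time to be close to the sum of the delays and the multi threaded one to be close to the largest delay.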
Summing Up
I hope this example has illustrated that multithreading in Ruby MRI using the
Thread class can be a simple and effective way of improving the performance
of I/O bound code in some scenarios.
The example we looked at was ideal in that the work of each thread was completely independent. This is definitely a scenario that I’ve come across a bunch of times, and for me the performance benefits are worth the added complexity.
Because the work of each thread was independent we didn’t have to worry about
synchronizing anything across threads. Sometimes this does become necessary, for
example when different threads must access a shared resource, and we have to
use tools such as the Mutex class to ensure that only one thread at a time
can execute a given block of code. This is where the complexity (and
fun!) of writing multi-threaded code can increase.
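As a taste of what that looks like, here is a minimal sketch of several threads updating a shared counter. The counter and the thread counts are made up for the example. Mutex#synchronize makes sure only one thread at a time runs the block that does the increment:

```ruby
counter = 0
lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    1_000.times do
      # Read, add and write back as one step that other threads can't interleave with.
      lock.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter
```

Because every increment happens while holding the lock, the final count is always 10 threads times 1,000 increments.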
