Enumerators are a powerful Ruby feature. Did you know that you can create enumerators with your own custom logic? Today, I will tell you how we fixed a production problem with a custom enumerator.
Problem
We had a method like the following:
def customer_property_ids
  sql = "SELECT DISTINCT PropertyId FROM AddressMatch"
  client.execute(sql).map { _1["PropertyId"] }
end
It runs a Microsoft SQL Server query that returns an array of property IDs. In Ruby, arrays include the Enumerable module, so we can use its rich assortment of methods to iterate over them.
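For example, each_slice, the method at the heart of this story, comes for free on any array:

numbers = [1, 2, 3, 4, 5]
numbers.each_slice(2) { |batch| p batch }
# Prints:
#   [1, 2]
#   [3, 4]
#   [5]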
Other methods from the same class relied on the array returned by customer_property_ids; we had three occurrences of the following call throughout the class:
customer_property_ids.each_slice(BATCH_SIZE) do |batch_ids|
  # ...
end
We pinpointed the cause of the crash: the SQL Server query returned more than 5 million property IDs, which made the Ruby process core dump due to memory exhaustion.
We needed to push a quick fix to production to get the application back up while limiting the scope of the changes for maximum safety. Getting back to business was a matter of the utmost urgency. What to do?
Solution
We would avoid the core dump if we kept fewer IDs in memory at a time, and that’s where our custom enumerator comes in. The first thing we did was figure out how to paginate a query in SQL Server. Here’s how:
SELECT DISTINCT PropertyId
FROM AddressMatch
ORDER BY PropertyId ASC
OFFSET 0 ROWS FETCH FIRST 100000 ROWS ONLY
This query retrieves the first 100,000 IDs. We can interpolate an OFFSET and a FETCH FIRST clause into the SQL string to get the subsequent pages.
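To make the interpolation concrete, here’s a small sketch; the paginated_sql helper is hypothetical, just for illustration, and pages are zero-based:

# Hypothetical helper: builds the query for a given page.
def paginated_sql(page, limit = 100_000)
  <<~SQL
    SELECT DISTINCT PropertyId
    FROM AddressMatch
    ORDER BY PropertyId ASC
    OFFSET #{page * limit} ROWS FETCH FIRST #{limit} ROWS ONLY
  SQL
end

paginated_sql(1) # second page: OFFSET 100000 ROWS FETCH FIRST 100000 ROWS ONLY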
Finally, we limited the changes to a single spot: the customer_property_ids method. Putting everything together, the enumerator code looked as follows:
def customer_property_ids
  Enumerator.new do |yielder|
    limit = 100_000
    offset = 0

    loop do
      sql = <<~SQL
        SELECT DISTINCT PropertyId
        FROM AddressMatch
        ORDER BY PropertyId ASC
        OFFSET #{offset} ROWS FETCH FIRST #{limit} ROWS ONLY
      SQL

      # Fetch one page of IDs and hand them to the yielder one by one.
      ids = client.execute(sql).map { |row| row["PropertyId"] }
      break if ids.none?

      ids.each { |id| yielder << id }
      offset += limit
    end
  end
end
In the enumerator, we manage a limit and an offset and execute the query within a loop until it returns an empty result. Each time the query returns the results for a page, we give the property IDs to the yielder.
With that scoped change, we avoided further changes to the methods that called customer_property_ids.each_slice. We built our own custom enumerator that paginates the query results and avoids memory bloat! And pagination remains an internal detail encapsulated by the enumerator.
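That works because Enumerator itself includes Enumerable, so each_slice behaves on it just as it did on the array. A quick sanity check:

enum = Enumerator.new { |yielder| 5.times { |i| yielder << i } }
enum.each_slice(2) { |batch| p batch }
# Prints:
#   [0, 1]
#   [2, 3]
#   [4]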
Explanation
Let’s explain the solution from a mental model perspective, which doesn’t necessarily correspond to the internal Ruby implementation details.
The magic of enumerators is that they use Fibers underneath, so in our example, whenever we shove a value into the yielder, it pauses execution and returns control to the each_slice method. If each_slice does not have enough values to work with, it resumes the execution and asks for more. Here’s a brief example:
enum = Enumerator.new do |yielder|
  puts "Starting the custom enumerator..."

  [1, 2].each do |n|
    puts "Giving #{n} to the yielder"
    yielder << n
  end

  puts "each_slice is still asking for more values..."

  [3, 4].each do |n|
    puts "Giving #{n} to the yielder"
    yielder << n
  end
end

enum.each_slice(3) do |slice|
  puts "We have enough, let's take a slice: #{slice}"
end
And the output:
Starting the custom enumerator...
Giving 1 to the yielder
Giving 2 to the yielder
each_slice is still asking for more values...
Giving 3 to the yielder
We have enough, let's take a slice: [1, 2, 3]
Giving 4 to the yielder
We have enough, let's take a slice: [4]
Back to the customer_property_ids enumerator: that continuous context switching guarantees that the values are processed as the yielder gets fed with them, which allows the garbage collector to discard the results that have already been used. Compare that to eagerly loading all 5 million IDs in memory!
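If you want to see the pause-and-resume mechanics for yourself, here’s a minimal sketch of that mental model built directly on Fiber; it is illustrative only, not how Ruby actually implements Enumerator:

fiber = Fiber.new do
  [1, 2, 3].each do |n|
    # Pause here and hand the value back to whoever called #resume.
    Fiber.yield(n)
  end
end

fiber.resume # => 1 (runs the block until the first Fiber.yield)
fiber.resume # => 2 (picks up exactly where it paused)
fiber.resume # => 3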
While working on that fix, we discovered that the code was suboptimal and could benefit from further refactoring to scale better, but that simple change quickly put our client back in business.