I’ve run into a couple situations recently where I’ve wanted to kick off a long-running process in response to some action (for example, an API call). A great strategy for this is to pair GenServers with a DynamicSupervisor.
For example, let’s say we have the following GenServer which we want to start up on demand. For simplicity, its only job is to print out a log statement and exit, but you can imagine more complicated business logic in here.
defmodule Greeter do
  use GenServer, restart: :transient

  def start_link(person),
    do: GenServer.start_link(__MODULE__, person, name: __MODULE__)

  def init(person),
    do: {:ok, person, {:continue, nil}}

  def handle_continue(nil, person) do
    IO.puts("👋 #{person}")
    {:stop, :normal, person}
  end
end
If the restart: :transient part is unfamiliar to you, it’s just how we say that it’s okay for this process to terminate when it’s done: a supervisor restarts a :transient child only if it exits abnormally. The default value for this is :permanent, which says that the process should always be brought back up if it terminates. This is fine for a long-running process, but since the job of our GenServer is to perform some work and then exit, we need to modify it. You can read more about the restart option in the Supervisor docs.
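To see what the option actually changes, here is a minimal sketch that inspects the child spec generated by use GenServer. TransientWorker is a hypothetical module made up for this illustration; only its use options matter.

```elixir
defmodule TransientWorker do
  # Hypothetical module for illustration; only the `use` options matter here.
  use GenServer, restart: :transient

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}
end

# `use GenServer` defines child_spec/1, which is what a supervisor reads
# when it is handed a `{TransientWorker, arg}` tuple.
spec = TransientWorker.child_spec("Alice")

IO.inspect(spec.restart) # :transient
IO.inspect(spec.start)   # {TransientWorker, :start_link, ["Alice"]}
```

The supervisor never looks at the module itself, only at this map, which is why a single option on the use line is enough.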
Also, as a total aside, if you’ve not seen handle_continue/2 before, you should check it out. It’s pretty cool!
To complete our little example, here’s a DynamicSupervisor we can use to start up our dynamic children:
defmodule GreeterSupervisor do
  use DynamicSupervisor

  def start_link(arg),
    do: DynamicSupervisor.start_link(__MODULE__, arg, name: __MODULE__)

  def init(_arg),
    do: DynamicSupervisor.init(strategy: :one_for_one)

  def greet(person),
    do: DynamicSupervisor.start_child(__MODULE__, {Greeter, person})
end
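As a rough, self-contained sketch of the same pattern in action, the snippet below starts an anonymous DynamicSupervisor and hands it a one-shot child. OneShot is a hypothetical stand-in for Greeter that reports back to the caller instead of printing, so the result can be observed.

```elixir
defmodule OneShot do
  # Hypothetical stand-in for Greeter: does its work, then stops normally.
  use GenServer, restart: :transient

  def start_link(parent), do: GenServer.start_link(__MODULE__, parent)

  @impl true
  def init(parent), do: {:ok, parent, {:continue, :work}}

  @impl true
  def handle_continue(:work, parent) do
    # Tell the caller we ran, then exit with reason :normal.
    send(parent, {:done, self()})
    {:stop, :normal, parent}
  end
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
{:ok, child} = DynamicSupervisor.start_child(sup, {OneShot, self()})

result =
  receive do
    {:done, ^child} -> :finished
  after
    1_000 -> :timeout
  end

IO.inspect(result)
# The child's normal exit does not take the supervisor down with it.
IO.inspect(Process.alive?(sup))
```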
You may have noticed that both our GenServer and DynamicSupervisor give their names as __MODULE__. In the case of the supervisor, this is fine, since the task we’re trying to accomplish just requires a single supervisor. But since our goal is to start up many instances of our GenServer, each instance needs a unique name; two processes can’t be registered under the same name at once. Let’s update our example to do this.
defmodule Greeter do
  # ...

  def start_link(person),
    do: GenServer.start_link(__MODULE__, person, name: process_name(person))

  defp process_name(person),
    do: String.to_atom("greeter_for_#{person}")

  # ...
end
Notice here that Elixir compels us to convert our process’ name into an atom. If we didn’t do this, we’d get an ArgumentError when starting up the child.
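A quick way to see that error for yourself is the sketch below. Sleeper is a hypothetical do-nothing GenServer; the error is raised by the :name option check, so the module’s behavior is irrelevant.

```elixir
defmodule Sleeper do
  # Hypothetical do-nothing server, used only to have something to start.
  use GenServer

  @impl true
  def init(arg), do: {:ok, arg}
end

# A plain string is not an accepted :name, so start_link raises.
string_result =
  try do
    GenServer.start_link(Sleeper, nil, name: "greeter_for_alice")
  rescue
    ArgumentError -> :argument_error
  end

IO.inspect(string_result) # :argument_error

# The same name as an atom is accepted and registered locally.
{:ok, pid} = GenServer.start_link(Sleeper, nil, name: :greeter_for_alice)
IO.inspect(Process.whereis(:greeter_for_alice) == pid) # true
```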
This dynamic creation of atoms is problematic, however, because Erlang—and by extension, Elixir—has a hard upper limit on the number of unique atoms an application can allocate. Once that limit is reached, the Erlang VM will crash. Moreover, atoms are never garbage collected, meaning that every new atom created will stick around for the entire lifetime of the application.
The code String.to_atom("greeter_for_#{person}") is tricky, then, because it allocates a new atom for each person we greet. The default size of the atom table is 1,048,576, meaning we could greet about a million unique people before our app crashed. That may sound like a lot, but because we’re sharing the atom table with all the other things our app is doing, we might hit the limit faster than you think.
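If you want to watch the atom table grow, the following sketch reads the VM’s counters before and after creating a fresh atom. It assumes a reasonably recent OTP release, where :erlang.system_info/1 accepts :atom_count and :atom_limit; the generated names are made up for the illustration.

```elixir
limit = :erlang.system_info(:atom_limit)
before = :erlang.system_info(:atom_count)

# Build a name the VM has almost certainly never seen as an atom.
name = "greeter_for_#{System.unique_integer([:positive])}_#{System.os_time()}"
_atom = String.to_atom(name)

after_count = :erlang.system_info(:atom_count)

IO.puts("atom limit: #{limit}")
IO.puts("atoms before: #{before}, after: #{after_count}")

# String.to_existing_atom/1 refuses to allocate: it raises for unknown names.
unknown = "never_seen_#{System.unique_integer([:positive])}_#{System.os_time()}"

existing_result =
  try do
    String.to_existing_atom(unknown)
  rescue
    ArgumentError -> :argument_error
  end

IO.inspect(existing_result)
```

The count only ever goes up, which is exactly why user-supplied strings are a risky source of atoms.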
Happily, Elixir provides us with a mechanism to handle this situation. It’s called a Registry. A registry is basically a way to map a process’ name to its underlying process ID (PID). (Registries have a couple other uses, too, but we won’t get into those here.)
The internal registry which Elixir normally uses to convert a process’ name into a PID intentionally requires that process names be atoms. This is because over the years Erlang has optimized atoms to be extremely fast when used as lookup keys. (One of these optimizations, storing each atom once in a fixed-size global table that is never garbage collected, is the reason behind the global atom limit to begin with.)
If we use the Registry module to run our own process name registry, however, we’re free to support any type of process name we like. This does come at a small efficiency cost, of course—what doesn’t?—but this is likely a very small part of the overall work performed by your GenServer, so it’s not worth worrying about too much.
Running your own registry is actually quite simple, and doesn’t even require defining a module. The only thing you have to do is add one line to the start/2 function of your Application module:
def start(_type, _args) do
  # ...

  children = [
    # ...
    {Registry, keys: :unique, name: GreeterRegistry}
  ]

  # ...
end
The keys: :unique part just says that we want process names to be unique, and the name is just some way for us to identify the registry. We might want to run other registries with different purposes down the line, so it’s a good practice to give registries names indicative of their use case.
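To get a feel for what a unique registry does, here is a small sketch you could paste into a script or IEx session. It starts a registry directly with Registry.start_link/1 rather than under an application supervisor, and SketchRegistry is a hypothetical name chosen to avoid clashing with the GreeterRegistry above.

```elixir
{:ok, _registry} = Registry.start_link(keys: :unique, name: SketchRegistry)

# Nothing is registered yet, so the lookup comes back empty.
IO.inspect(Registry.lookup(SketchRegistry, "greeter_for_alice")) # []

# Keys can be any term; here it is a plain string, no atom required.
{:ok, _owner} = Registry.register(SketchRegistry, "greeter_for_alice", nil)

# The registry maps the key to the process that registered it.
[{pid, nil}] = Registry.lookup(SketchRegistry, "greeter_for_alice")
IO.inspect(pid == self()) # true

# With keys: :unique, a second registration under the same key is refused.
second = Registry.register(SketchRegistry, "greeter_for_alice", nil)
IO.inspect(second) # {:error, {:already_registered, pid}}
```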
We then have to instruct our GenServer to use our special new registry instead of the default one. For us this just means changing our process_name/1 function.
defmodule Greeter do
  # ...

  def start_link(person),
    do: GenServer.start_link(__MODULE__, person, name: process_name(person))

  defp process_name(person),
    do: {:via, Registry, {GreeterRegistry, "greeter_for_#{person}"}}

  # ...
end
Note how we’ve been able to remove the String.to_atom/1 call. Nice!
Dealing with this “via tuple” (as it’s called) is a little more cumbersome than a simple atom, but if you define something like our process_name/1 function, it’s not too bad. And, more importantly, your app won’t crash because it ran out of atoms, which is also pretty cool.
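Putting the pieces together, here is a self-contained sketch of via-tuple naming under a DynamicSupervisor. NamedGreeter and SketchGreeterRegistry are hypothetical names; unlike the Greeter above, this server stays alive after starting so that its name remains registered and can be looked up.

```elixir
defmodule NamedGreeter do
  # Hypothetical long-lived variant of Greeter, so the name stays registered.
  use GenServer

  def start_link(person),
    do: GenServer.start_link(__MODULE__, person, name: via(person))

  # The same via tuple works for calls as well as for registration.
  def greet(person), do: GenServer.call(via(person), :greet)

  def via(person),
    do: {:via, Registry, {SketchGreeterRegistry, "greeter_for_#{person}"}}

  @impl true
  def init(person), do: {:ok, person}

  @impl true
  def handle_call(:greet, _from, person), do: {:reply, "👋 #{person}", person}
end

{:ok, _registry} = Registry.start_link(keys: :unique, name: SketchGreeterRegistry)
{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

{:ok, pid} = DynamicSupervisor.start_child(sup, {NamedGreeter, "Alice"})

greeting = NamedGreeter.greet("Alice")
IO.puts(greeting)

# The via tuple resolves to the child's PID through the registry.
IO.inspect(GenServer.whereis(NamedGreeter.via("Alice")) == pid) # true

# Starting a second process under the same name is refused, not duplicated.
duplicate = DynamicSupervisor.start_child(sup, {NamedGreeter, "Alice"})
IO.inspect(duplicate) # {:error, {:already_started, pid}}
```

Keeping the via tuple behind one function means callers only ever deal with the person’s name, never with the tuple itself.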
So that’s it! For the super curious, this answer on an Erlang mailing list goes a bit more in depth as to the design and performance considerations that went into giving Erlang an atom limit to begin with. Thanks for reading along!