Elixir for Rubyists

Tute Costa

When I started my first project in Elixir I was still thinking in Ruby. At first, this was not much of a setback: Ruby and Elixir share similar syntax and are both high-level, readable, fun programming languages. Elixir’s fast-growing, supportive community is reminiscent of Ruby’s during the early days of Rails.

Some features in Ruby map directly to features in Elixir:

Ruby                    Elixir
irb                     iex
rake tasks              mix (built in)
bundler dependencies    mix (built in)
binding.pry             IEx.pry (built in)
Polymorphism            Protocols
Lazy Enumerables        Streams
Metaprogramming         Macros (used sparingly)
Rails                   Phoenix

Some aspects look different at first glance. For example, Elixir code looks a bit more verbose than Ruby code. Module names are spelled out in most function calls. Modules used in the current file are explicitly referenced (with alias, import, require, or use). State is passed into functions as arguments. Before explaining how macros can extend the language, the documentation explicitly discourages their use.

Elixir can indeed be thought of as Ruby for the Erlang VM, and this first approach is sufficient for small projects. While Elixir resembles Ruby, there are notable differences.

Elixir is Erlang in Ruby’s clothing

Elixir programs compile to bytecode that runs on the Erlang VM (BEAM). Erlang was developed at Ericsson in 1986, has been open source since 1998, and is maintained by the Open Telecom Platform (OTP) unit at Ericsson.

Performance

The WhatsApp development team was able to establish two million TCP connections on a single box using Erlang alone. How is it so performant?

The Erlang VM runs as one operating system process, and by default runs one scheduler thread per core, so Elixir programs can use all CPU cores.
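
One way to check this on a given machine is to ask the VM how many scheduler threads it is running. This is a minimal sketch that assumes the VM was started with its default settings (no scheduler flags):

```elixir
# Number of scheduler threads currently available to run Erlang processes.
schedulers = System.schedulers_online()

# Logical processors detected by the VM; may be :unknown on some platforms.
cores = :erlang.system_info(:logical_processors_available)

IO.puts("schedulers online: #{schedulers}, logical processors: #{inspect(cores)}")
```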

Erlang processes have no connection to OS processes or threads. Erlang processes are lightweight (they grow and shrink dynamically) with a small memory footprint, are fast to create and terminate, and have low scheduling overhead. An Erlang system running over one million (Erlang) processes may run as a single operating system process. Erlang processes share no state with each other, and communicate through asynchronous messages. This makes it the first popular actor-based concurrency implementation.
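
To make this concrete, here is a small sketch that spawns ten thousand processes and collects one message from each. The message shape ({:result, pid, value}) and the process count are arbitrary choices for illustration:

```elixir
parent = self()

# Spawn 10_000 lightweight processes; each one sends its result to the parent.
pids =
  for n <- 1..10_000 do
    spawn(fn -> send(parent, {:result, self(), n * n}) end)
  end

# Collect one message per process, in spawn order. Processes share no state:
# the only way a result reaches the parent is through the parent's mailbox.
results =
  for pid <- pids do
    receive do
      {:result, ^pid, value} -> value
    end
  end

IO.puts("collected #{length(results)} results")
```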

If a process is waiting for a message (blocked in a receive expression), it is not queued for execution until a matching message arrives. This is why millions of mostly idle processes are able to run on a single machine without reaching high CPU usage.
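
The waiting state can be observed directly. The sketch below spawns a process that blocks in receive, inspects its status, and then wakes it with a message. The :ping and :pong atoms and the 50 ms pause are assumptions made for the example:

```elixir
# A process that does nothing but wait for a message.
pid =
  spawn(fn ->
    receive do
      {:ping, from} -> send(from, :pong)
    end
  end)

# Give the scheduler a moment to run the new process up to its receive.
Process.sleep(50)

# While blocked in receive, the process is reported as waiting, not runnable.
status = Process.info(pid, :status)
IO.inspect(status)

# A message wakes it up; it replies and then exits.
send(pid, {:ping, self()})

reply =
  receive do
    :pong -> :pong
  after
    1_000 -> :timeout
  end

IO.inspect(reply)
```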

Erlang’s garbage collector works under certain assumptions that help its efficiency. Data is immutable, so once a value is created, it never changes. Values are copied between processes, so memory referenced in a process is (almost always) isolated. And the garbage collector runs per process, on heaps that are relatively small. See section 4 of Programming the Parallel World for a detailed overview of Erlang processes and garbage collection.

In short, Erlang handles this kind of highly concurrent workload efficiently because it doesn’t allow mutability or shared memory by default.

Fault tolerance

Erlang’s architecture is share-nothing: each node is independent of the others and self-sufficient. Software in Erlang can be architected in such a way that there is no single point of failure and that upgrades are non-disruptive.

Paraphrasing José Valim on The Changelog podcast, fault tolerance in Erlang means “keep the system running”: it’s ok to maybe drop a user’s phone call, but it’s not ok to drop everyone’s phone call.

Unlike web companies, telecommunication companies can’t call everyone and say there will be an outage between 6 and 6:30am. They have to keep the lights on at all times.

Pure functions

Erlang is a functional programming language. Functional programming treats computation as the evaluation of functions, and avoids mutable state and data. A pure function shows referential transparency: calls to the function can be replaced by their return values without changing the semantics of the program. We want pure functions because they are deterministic, easier to test, and easier to reason about.

Erlang’s immutable data structures and single assignment variables help with writing pure functions. Researchers analyzing several sizeable Erlang code bases classified, on average, between 30% and 50% of the functions as pure. It’s possible to have side effects too: we can dynamically load and unload code, and change the operating system’s environment variables.

Ruby doesn’t enforce isolated state. Values can be mutated in place (string.gsub!). A getter is a canonical example of an impure function: each time it’s called with the same arguments it might return different results according to the object’s internal state.
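
For contrast, here is a minimal sketch of how the Elixir counterparts behave. The sample values are made up; String.replace/3 and Map.put/3 are standard library functions that return new values rather than changing their arguments:

```elixir
title = "Elixir for Rubyists"

# String.replace/3 returns a new string; the original value is untouched.
shouted = String.replace(title, "Rubyists", "RUBYISTS")

# Map.put/3 likewise returns a new map instead of changing the one passed in.
post = %{title: title, views: 0}
viewed = Map.put(post, :views, 1)

IO.inspect({title, shouted})
IO.inspect({post.views, viewed.views})
```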

Other Erlang features available in Elixir

  • Distributed
  • Soft real-time (think telecommunications’ quality of service)
  • Highly available
  • Hot code swapping

Elixir on its own merits

The Pipe Operator

Let’s say we want to convert blog post titles into permalinks. For a title like ExMachina Hits 1.0 - More Flexible, More Useful and Support for Ecto 2.0 we expect a permalink like exmachina-hits-1-0-more-flexible-more-useful-and-support-for-ecto-2-0.

A Ruby implementation is:

title.
  downcase.
  gsub(/\W/, " "). # convert non-word chars (like -,!) into spaces
  split.    # drop extra whitespace
  join("-") # join words with dashes

Each call returns a String object or an object that implements Enumerable, and so we can chain String or Enumerable methods on each result. We’d like to give names to each step, so that the code comments are not necessary. How can we make the code look like the following?

title.
  downcase.
  replace_non_words_with_spaces.
  drop_extra_whitespace.
  join_with_dashes

To make this work we’d need to monkey patch the String class to define replace_non_words_with_spaces and drop_extra_whitespace, and Enumerable to define join_with_dashes.

Now let’s think of the same feature in Elixir. In a functional fashion, the code might look like:

Enum.join(
  String.split(
    String.replace(
      String.downcase(title), ~r/\W/, " "
    )
  ),
  "-"
)

We can make this more readable by using local variables:

downcased = String.downcase(title)
non_words_to_spaces = String.replace(downcased, ~r/\W/, " ")
whitespace_to_dashes = Enum.join(String.split(non_words_to_spaces), "-")

But Elixir provides a more idiomatic approach. The pipe operator passes the expression on its left-hand side as the first argument to the function call on its right-hand side. It allows us to write code in the following shape:

title
|> String.downcase()
|> String.replace(~r/\W/, " ")
|> String.split()
|> Enum.join("-")
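
A quick way to convince yourself of the first-argument rule is to compare a piped call with its plain equivalent. The sample title here is made up for the example:

```elixir
title = "Hello, World"

# Piping passes the left-hand side as the first argument...
piped = title |> String.replace(~r/\W/, " ")

# ...so it is equivalent to the plain function call.
direct = String.replace(title, ~r/\W/, " ")

IO.inspect(piped == direct)
```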

Each step can be extracted into an intention-revealing function:

title
|> downcased()
|> non_words_to_spaces()
|> whitespace_to_dashes()
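
One possible way to define those functions is sketched below. The Permalink module name and the from_title function are assumptions made for this example; the three step names follow the pipeline above:

```elixir
defmodule Permalink do
  # Hypothetical module name; the step names follow the pipeline in the text.
  def from_title(title) do
    title
    |> downcased()
    |> non_words_to_spaces()
    |> whitespace_to_dashes()
  end

  defp downcased(text), do: String.downcase(text)

  defp non_words_to_spaces(text), do: String.replace(text, ~r/\W/, " ")

  # Splitting on whitespace drops the extra spaces before joining with dashes.
  defp whitespace_to_dashes(text) do
    text
    |> String.split()
    |> Enum.join("-")
  end
end

permalink =
  Permalink.from_title("ExMachina Hits 1.0 - More Flexible, More Useful and Support for Ecto 2.0")

IO.puts(permalink)
```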

Two Ruby libraries implement function composition in a way similar to the pipe operator. Fabio Akita created the chainable_methods RubyGem, which wraps a value in an object that implements pipe operator behavior; to get the final result, one calls unwrap as the last method call. Mike Burns authored the method_missing RubyGem, which allows chaining functions with a * operator.

=, the match operator

In Elixir, the = operator is the match operator. It enables us to assign and then match values. Let’s take a look:

iex> {a, b, 42} = {:hello, "world", 42}
{:hello, "world", 42}
iex> a
:hello
iex> b
"world"
iex> :hello = a
:hello
iex> "world" = a
** (MatchError) no match of right hand side value: :hello

Pattern Matching and Multiple Function Clauses

Pattern matching allows us to match simple values, data structures, and even functions. Function declarations support guards and multiple clauses. If a function has several clauses, Elixir will try each clause in order until it finds one that matches.
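
As a small illustration of guards combined with multiple clauses, consider the sketch below. The Temperature module and its thresholds are invented for the example:

```elixir
defmodule Temperature do
  # Hypothetical example module. Clauses are tried from top to bottom,
  # so the more specific guards come first.
  def describe(:unknown), do: "no reading"
  def describe(celsius) when is_number(celsius) and celsius < 0, do: "freezing"
  def describe(celsius) when is_number(celsius) and celsius < 25, do: "mild"
  def describe(celsius) when is_number(celsius), do: "hot"
end

IO.puts(Temperature.describe(:unknown))
IO.puts(Temperature.describe(-5))
IO.puts(Temperature.describe(18))
IO.puts(Temperature.describe(31))
```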

Let’s see an example. In this template we want to add an active class name to an li element only if the currently assigned document is the same as the one we are currently iterating over. The template follows (@conn.assigns looks like a hash of hashes):

<%= for document <- documents do %>
  <li class="<%= active_class(@conn.assigns, document.id) %>">
    <%= document.title %>
  </li>
<% end %>

A typical active_class implementation would include a conditional that checks whether a document is currently assigned and whether its id equals the id passed in. Our Elixir implementation reads:

def active_class(%{document: %{id: id}}, id), do: "active"
def active_class(_, _), do: ""

In this implementation, we match the map for a key called document, and if there is one, we match on its id. The first clause uses that same id variable as its second argument, so it matches only when both values are equal, and in that case the "active" string is returned. The second clause matches any other case, for which we return the empty string.
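
To see both clauses in action outside a template, the two definitions can be wrapped in a module and called with sample data. The DocumentView module name and the sample assigns are assumptions for this sketch:

```elixir
defmodule DocumentView do
  # Hypothetical module name wrapping the two clauses from the text.
  def active_class(%{document: %{id: id}}, id), do: "active"
  def active_class(_, _), do: ""
end

assigns = %{document: %{id: 7, title: "Current"}}

# Same id in both places: the first clause matches.
IO.inspect(DocumentView.active_class(assigns, 7))

# Different id, or no assigned document: only the catch-all clause matches.
IO.inspect(DocumentView.active_class(assigns, 8))
IO.inspect(DocumentView.active_class(%{}, 7))
```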

Pattern matching spares us from conventional control-flow structures. Programs are composed of many small function clauses, with guard clauses or pattern matching to trigger the right behavior according to how the arguments match. Code ends up being more declarative than imperative.

Macros

Macros for metaprogramming are another feature that Elixir adds on top of Erlang. This is a topic for another blog post.

When is Elixir a better fit than Ruby?

I believe Elixir and Ruby are interchangeable for simple web applications that don’t handle high traffic or require very short response times.

Some people prefer the compile-time checks that we don’t get in Ruby. Following TDD and keeping code quality high, I don’t see meaningful differences in the cost of maintaining a small or medium project depending on the language in which it was developed.

For some applications though, Elixir makes better technical sense:

Scale

Elixir makes scaling easier than Ruby. Phoenix’s throughput, around 10x that of a comparable Rails application, means that you would need to add caching or hosts to your Elixir deployment later than to a Rails deployment.

Elixir processes are identified by a node and a process id. Spawning a process on another host works the same way as spawning it on the current host. In either case there is no shared state between processes or hosts; shared state is a problem one has to consciously take care of in Ruby.
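
The sketch below shows the node-aware form of spawning. To stay runnable on a single machine it targets the local node, which is an assumption of the example; with distribution enabled, the name of a connected remote node would be passed in the same position:

```elixir
parent = self()

# node/0 names the VM this code runs on; without distribution it is :nonode@nohost.
IO.inspect(node())

# Node.spawn/2 takes the target node explicitly. Here it is the local node.
pid = Node.spawn(node(), fn -> send(parent, {:hello_from, node(), self()}) end)

# The reply carries the node and pid that together identify the sender.
reply =
  receive do
    {:hello_from, from_node, ^pid} -> {:ok, from_node}
  after
    1_000 -> :timeout
  end

IO.inspect(reply)
```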

Supervision trees and the ability to configure how to recover from failure when processes die are built-in features that aid with horizontal scaling.
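
As a rough sketch of what this looks like in code, the example below supervises a single worker and kills it to trigger a restart. The Counter module, its functions, and the polling helper are all hypothetical names chosen for illustration; Supervisor and Agent are part of Elixir's standard library:

```elixir
defmodule Counter do
  # Hypothetical worker: an Agent holding an integer, registered under its module name.
  use Agent

  def start_link(initial), do: Agent.start_link(fn -> initial end, name: __MODULE__)
  def increment, do: Agent.update(__MODULE__, &(&1 + 1))
  def value, do: Agent.get(__MODULE__, & &1)
end

# The :one_for_one strategy restarts only the child that died.
{:ok, _supervisor} = Supervisor.start_link([{Counter, 0}], strategy: :one_for_one)

Counter.increment()
before_crash = Counter.value()

# Simulate a failure by killing the worker.
old_pid = Process.whereis(Counter)
Process.exit(old_pid, :kill)

# Poll until the supervisor has registered a replacement process.
wait_for_restart = fn wait_for_restart ->
  case Process.whereis(Counter) do
    pid when is_pid(pid) and pid != old_pid ->
      pid

    _ ->
      Process.sleep(10)
      wait_for_restart.(wait_for_restart)
  end
end

new_pid = wait_for_restart.(wait_for_restart)

# The restarted worker begins again from its initial state.
after_restart = Counter.value()
IO.inspect({before_crash, after_restart, new_pid != old_pid})
```

The supervisor logs an error report when the worker is killed; that is expected here. Note that the restarted worker starts from its initial state, so anything that must survive a crash has to live elsewhere.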

High-availability Systems

Fault tolerance and hot code swapping are two Elixir features that help with the deployment of highly available systems.


We are happily and productively building projects in Elixir. To read more of what we are learning, check out our Elixir posts.