---
title: Redis Set Intersection - Using Sets to Filter Data
teaser:
tags: web,redis
author: Harlow Ward
published_on: 2013-03-26
---

Redis is a fantastic lightweight key-value store. It also acts as a data structure
server for storing data types like lists, sets, and hashes.

With [Redis sets][5] we can store records in groups based on time. The
[sunion][3] operator then enables us to group multiple sets together based on a
particular time frame.

## Example application

Search flights based on a departure city, arrival city, and the date/time
of departure.

![Redis set intersection][9]

|Id|Departure Time|Airline|Departure City|Arrival City|
|1|03/23/2013 1:30 pm|Virgin America|Los Angeles|New York|
|2|03/23/2013 1:45 pm|Virgin America|San Francisco|New York|
|3|03/23/2013 1:45 pm|American Airlines|San Francisco|New York|
|4|03/23/2013 1:50 pm|American Airlines|Los Angeles|Boston|
|5|03/23/2013 2:45 pm|Southwest|San Francisco|New York|
|6|03/23/2013 3:30 pm|Southwest|San Francisco|New York|

The flight data is stored as strings in Redis using key-value pairs with the
following key structure:

    get flights:1
    "1, Virgin America, 03/23/2013 1:30 pm, San Francisco, San Diego"
    get flights:2
    "2, Virgin America, 03/23/2013 1:45 pm, San Francisco, New York"
    ...
    get flights:6
    "6, Southwest, 03/23/2013 3:30 pm, San Francisco, New York"

## Group flights by departure time

To group flights by departure time we'll create a Redis set for each hour of the
day, and flights will be added to the appropriate set based on its epoch
departure time.

    smembers departure_time:1364068800:flights
    1) "1"
    2) "2"
    3) "3"
    4) "4"
    smembers departure_time:1364072400:flights
    1) "5"
    smembers departure_time:1364076000:flights
    1) "6"

## Group flights by city

Flights will also be grouped into sets based on their departure and arrival cities.
The city names have been [parameterized][6] to keep a consistent key structure.

Departing flights:

    smembers cities:los-angeles:departures
    1) "1"
    2) "4"
    smembers cities:san-francisco:departures
    1) "2"
    2) "3"
    3) "5"
    4) "6"

Arriving flights:

    smembers cities:boston:arrivals
    1) "4"
    smembers cities:new-york:arrivals
    1) "1"
    2) "2"
    3) "3"
    4) "5"
    5) "6"

## Find flights from San Francisco to New York

To find all flights departing from San Francisco and arriving in New York we can
use [sinter][7] to get the intersection of two sets.

    sinter cities:san-francisco:departures cities:new-york:arrivals
    1) "2"
    2) "3"
    3) "5"
    4) "6"

However, this technique does not allow us to filter flights over a range in time.

To find flights departing on 03/23/2013 between 1:00 pm and 3:00 pm, we'll need
to group the sets of departure times, and then perform an intersection with
the resulting set.

## Redis set intersection

![Redis set intersection][2]

Here are the steps we'll take:

1. Create a new `temp_set` using [sunionstore][4] to group flights by departure time.
1. Intersect the temporary set with the departure and arrival sets.
1. Loop over the results of the intersection and generate an array of flight keys.
1. Use [mget][8] to fetch all the flight data.

We'll use a combination of Ruby and Redis to achieve this:

    # flight_finder.rb
    class FlightFinder
      def initialize(attrs)
        @arrival_city = attrs.fetch(:arrival_city).parameterize
        @departure_city = attrs.fetch(:departure_city).parameterize
        @window = attrs.fetch(:window)
      end

      def flights
        redis.mget(*flight_keys)
      end

      private

      attr_reader :arrival_city, :departure_city, :window

      def flight_keys
        flight_ids.map { |id| "flights:#{id}" }
      end

      def flight_ids
        redis.multi do
          redis.sunionstore('temp_set', *departure_time_keys)
          redis.sinter('temp_set', departure_cities_key, arrival_cities_key)
        end.last
      end

      def departure_cities_key
        "cities:#{departure_city}:departures"
      end

      def arrival_cities_key
        "cities:#{arrival_city}:arrivals"
      end

      def departure_time_keys
        window.range_keys { |epoch_time| "departure_time:#{epoch_time}:flights" }
      end

      def redis
        Redis.current
      end
    end

We'll introduce the concept of a `window` to avoid a long parameter list,
and group parameters that naturally fit together. This allows us to encapsulate
behavior between the `start_at` and `end_at` attributes.

    # window.rb
    class Window
      ONE_HOUR_IN_SECONDS = 3600

      def initialize(start_at, end_at)
        @start_at = convert_to_epoch(start_at)
        @end_at = convert_to_epoch(end_at)
      end

      def range_keys
        (start_at...end_at).step(hour).map do |epoch_time|
          yield(epoch_time)
        end
      end

      private

      attr_reader :start_at, :end_at

      def convert_to_epoch(string_or_time)
        DateTime.parse(string_or_time).to_time.to_i
      end

      def hour
        ONE_HOUR_IN_SECONDS
      end
    end

Create a command line script.

    # flights.rb
    require 'flight_finder'
    require 'window'

    window = Window.new(
      start_at: 'March 23, 2013 1:00 pm',
      end_at: 'March 23, 2013 3:00 pm'
    )

    flight_finder = FlightFinder.new(
      departure_city: 'San Francisco',
      arrival_city: 'New York',
      window: window
    )

    puts flight_finder.flights

Run the command line script.

    $ ruby flights.rb
    2, Virgin America, 03/23/2013 1:45 pm, San Francisco, New York
    3, American Airlines, 03/23/2013 1:45 pm, San Francisco, New York
    5, Southwest, 03/23/2013 2:45 pm, San Francisco, New York

## Takeaways

* Redis set intersection is a powerful filtering technique.
* Use [sunionstore][4] to create new sets on the fly.
* Redis data structures are fantastic for slicing and dicing data.

### Next Steps & Related Reading

[Redis partial word match — you (auto)complete me][11]

[Redis Pub/Sub…how does it work?][10]

[2]: http://img9.imageshack.us/img9/6523/redisintersectiongraph.png
[3]: http://redis.io/commands/sunion
[4]: http://redis.io/commands/sunionstore
[5]: http://redis.io/commands#set
[6]: http://apidock.com/rails/ActiveSupport/Inflector/parameterize
[7]: http://redis.io/commands/sinter
[8]: http://redis.io/commands/mget
[9]: http://img822.imageshack.us/img822/2518/airlinearrivalsdepartur.jpg
[10]: https://thoughtbot.com/blog/post/6325247416/redis-pub-sub-how-does-it-work
[11]: https://thoughtbot.com/blog/post/48851498400/redis-partial-word-match-you-auto-complete-me
