---
title: IO in Ruby
teaser: Learn about Ruby's beautiful, duck-typed approach to Input/Output.
tags: web,ruby,unix
author: Joël Quenneville
published_on: 2014-10-01
---

Input/Output, generally referred to as I/O, is a term that covers the ways that
a computer interacts with the world. Screens, keyboards, files, and networks are
all forms of I/O. Data from these devices is sent to and from programs as a
stream of characters/bytes.

Unix-like systems treat all external devices as files. We can see these under
the `/dev` directory.
Read [this
list](http://docstore.mik.ua/orelly/unix3/mac/appa_01.htm#mosxgeeks-APP-A-TABLE-6)
for a quick description of all the devices we might find under `/dev` for OS X.

For example (truncated for brevity):

```shell
$ tree /dev
/dev
├── disk0
├── fd
│   ├── 0
│   ├── 1
│   ├── 2
│   └── 3 [error opening dir]
├── null
├── stderr -> fd/2
├── stdin -> fd/0
├── stdout -> fd/1
├── tty
└── zero
```

I/O streams are located under the `/dev/fd` directory. Files there are given a
number, known as a file descriptor. The operating system provides three streams
by default. They are:

* Standard input (`/dev/fd/0`)
* Standard output (`/dev/fd/1`)
* Standard error (`/dev/fd/2`)

They are often abbreviated to stdin, stdout, and stderr respectively. Standard
input will default to reading from the keyboard while standard output and
standard error both default to writing to the terminal. As can be seen above,
`/dev/stdout`, `/dev/stdin`, and `/dev/stderr` are just symlinks to the
appropriate file descriptor.
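
As a minimal sketch, Ruby can report these same numbers through `IO#fileno` on its `STDIN`, `STDOUT`, and `STDERR` constants. The `standard_descriptors` helper is a name made up for this illustration:

```ruby
# Hypothetical helper: map each standard stream to its file descriptor number.
def standard_descriptors
  { stdin: STDIN.fileno, stdout: STDOUT.fileno, stderr: STDERR.fileno }
end

standard_descriptors.each do |name, fd|
  puts "#{name}: /dev/fd/#{fd}"
end
# stdin: /dev/fd/0
# stdout: /dev/fd/1
# stderr: /dev/fd/2
```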

## The `IO` class

Ruby `IO` objects wrap Input/Output streams. The constants `STDIN`, `STDOUT`, and
`STDERR` point to `IO` objects wrapping the standard streams. By default the
global variables `$stdin`, `$stdout`, and `$stderr` point to their respective
constants. While the constants should always point to the default streams, the
globals can be overwritten to point to another I/O stream such as a file. `IO`
objects can be written to via `puts` and `print`.

```ruby
$stdout.puts 'Hello World'
```

We've all written the shorthand version of this program:

```ruby
puts 'Hello World'
```

The bare `puts` method is provided by Ruby's `Kernel` module and is just a
shorthand for `$stdout.puts`. Similarly, `IO` objects can be read from via `gets`.
The bare `gets` provided by `Kernel` reads from `$stdin` as long as no file
names were passed in `ARGV` (otherwise it reads from those files).
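
Because the bare `puts` goes through `$stdout`, reassigning that global redirects its output. Here is a minimal sketch, assuming a hypothetical `capture_stdout` helper (not part of Ruby) that uses a `StringIO` as the temporary target:

```ruby
require 'stringio'

# Hypothetical helper: temporarily point $stdout at an in-memory stream,
# run the block, and return whatever was written.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  # Always restore the previous stream, even if the block raises.
  $stdout = original
end

captured = capture_stdout { puts 'Hello World' }
puts captured.inspect # "Hello World\n"
```

The `STDOUT` constant is untouched throughout; only the global is swapped.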

`$stdin` is read-only while `$stdout` and `$stderr` are write-only.

```ruby
[1] pry(main)> $stdin.puts 'foo'
IOError: not opened for writing
[2] pry(main)> $stdout.gets
IOError: not opened for reading
[3] pry(main)> $stderr.gets
IOError: not opened for reading
```

To create a new `IO` object, we need a file descriptor.
In this case, 1 (stdout).

```ruby
[1] pry(main)> io = IO.new(1)
=> #<IO:fd 1>
[2] pry(main)> io.puts 'hello world'
hello world
=> nil
```

What about creating IOs to other streams? They don't have fixed file
descriptors, so we first need to get one via `IO.sysopen`.

```ruby
[1] pry(main)> fd = IO.sysopen('/dev/null', 'w+')
=> 8
[2] pry(main)> dev_null = IO.new(fd)
=> #<IO:fd 8>
[3] pry(main)> dev_null.puts 'hello'
=> nil
[4] pry(main)> dev_null.gets
=> nil
[5] pry(main)> dev_null.close
=> nil
```

`/dev/null` (sometimes referred to as the "bit bucket" or "black hole") is the
null device on Unix-like systems. Writing to it does nothing and attempting to
read from it returns nothing (`nil` in Ruby).

First, we get a file descriptor for a stream that is read/write to the
`/dev/null` device. Then we create an `IO` object for this stream so we can
interact with it in Ruby. When writing to `dev_null`, the text no longer appears
on the screen. When reading from `dev_null`, we get `nil`.

Since everything on a Unix-like system is a file, we can open an `IO` stream to
a text file in the same way we would open a device. We just create a file
descriptor with the path to our file and then create an `IO` object for that
file descriptor. When we are done with it, we close the stream to flush Ruby's
buffer and release the file descriptor back to the operating system. Attempting
to read from or write to a closed stream will raise an `IOError`.
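
The open, use, close cycle can be sketched as follows. The `write_to_null_device` helper is a hypothetical name, and the sketch writes to `File::NULL` so that it leaves nothing behind:

```ruby
# Hypothetical helper: open a stream by hand, write to it, and close it.
def write_to_null_device(message)
  fd = IO.sysopen(File::NULL, 'w') # ask the OS for a file descriptor
  io = IO.new(fd, 'w')             # wrap the descriptor in an IO object
  io.puts message
  io.close                         # flush and release the descriptor
  io
end

io = write_to_null_device('hello')
puts io.closed? # true

begin
  io.puts 'one more line'
rescue IOError => error
  # Writing after close raises IOError.
  puts "IOError: #{error.message}"
end
```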

## Position

When working with an `IO`, we have to keep position in mind. Given that we've
opened a stream to the following file:

    Lorem ipsum
    dolor
    sit amet...

and we call `gets` on it:

```ruby
[1] pry(main)> IO.sysopen '/Users/joelquenneville/Desktop/lorem.txt'
=> 8
[2] pry(main)> lorem = IO.new(8)
=> #<IO:fd 8>
[3] pry(main)> lorem.gets
=> "Lorem ipsum\n"
```

it returns the first line of the file and moves the cursor to the next line. If
we check the position of the cursor:

```ruby
[4] pry(main)> lorem.pos
=> 12
```

If we call `gets` a few more times:

```ruby
[5] pry(main)> lorem.gets
=> "dolor\n"
[6] pry(main)> lorem.gets
=> "sit amet...\n"
[7] pry(main)> lorem.pos
=> 30
```

we can see Ruby's "cursor" has moved. Now that we have read the whole file, what
happens if we try to call `gets`?

```ruby
[8] pry(main)> lorem.gets
=> nil
[9] pry(main)> lorem.eof?
=> true
```

We see that it returns `nil`. We can ask a stream if we have reached "end of
file" via `eof?`. To return to the beginning of the stream, we can call
`rewind`.

```ruby
[10] pry(main)> lorem.rewind
=> 0
[11] pry(main)> lorem.pos
=> 0
```
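
The same walk through a stream can be sketched as a script. It assumes a temporary file with the three lines shown earlier, and `positions_after_each_line` is a hypothetical helper name:

```ruby
require 'tempfile'

# Hypothetical helper: read line by line and record where the cursor
# sits after each `gets`.
def positions_after_each_line(io)
  positions = []
  while io.gets
    positions << io.pos
  end
  positions
end

Tempfile.create('lorem') do |file|
  file.write("Lorem ipsum\ndolor\nsit amet...\n")
  file.rewind
  p positions_after_each_line(file) # [12, 18, 30]
  p file.eof?                       # true
  file.rewind
  p file.pos                        # 0
end
```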

The cursor position can lead to surprises when writing to a stream.

```ruby
[1] pry(main)> fd = IO.sysopen '/Users/joelquenneville/Desktop/test.txt', 'w+'
=> 8
[2] pry(main)> io = IO.new(fd)
=> #<IO:fd 8>
[3] pry(main)> io.puts 'hello world'
=> nil
[4] pry(main)> io.puts 'goodbye world'
=> nil
```

This stream has the lines "hello world" and "goodbye world".
If we were to attempt to read:

```ruby
[5] pry(main)> io.gets
=> nil
[6] pry(main)> io.eof?
=> true
```

Our cursor is currently at the end of the file. In order to read we would need
to first rewind.

```ruby
[7] pry(main)> io.rewind
=> 0
[8] pry(main)> io.gets
=> "hello world\n"
```

Any write operations in the middle of a stream will overwrite the existing data:

```ruby
[9] pry(main)> io.pos
=> 12
[10] pry(main)> io.puts "middle"
=> nil
[11] pry(main)> io.rewind
=> 0
[12] pry(main)> io.read
=> "hello world\nmiddle\n world\n"
```
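
If overwriting is not the intent, one option is to move the cursor to the end of the stream before writing. This is a sketch under that assumption; `append_line` is a hypothetical helper, and opening the file in append mode (`'a'`) would be another way to get similar behavior:

```ruby
require 'tempfile'

# Hypothetical helper: jump to the end of the stream, then write.
def append_line(io, line)
  io.seek(0, IO::SEEK_END)
  io.puts line
end

Tempfile.create('test') do |io|
  io.puts 'hello world'
  io.puts 'goodbye world'
  io.rewind
  io.gets                   # cursor is now at position 12, mid-stream
  append_line(io, 'middle') # lands after the existing data
  io.rewind
  puts io.read
end
# hello world
# goodbye world
# middle
```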

This kind of behavior is necessary because streams do not get loaded into
memory. Instead, only the lines being operated on are loaded. This is very
useful because some streams can point to very large files that would be
expensive to load in memory all at once. Streams can also be infinite. For
example, `$stdin` has no end. We can always read more data from it (when it
receives the message `gets`, it waits for the user to type something).
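
As an illustration of reading a stream piece by piece, here is a sketch that counts matching lines with `File.foreach`, which yields one line at a time rather than building the whole file as a single string. The `count_matching_lines` helper and the sample log lines are made up for this example:

```ruby
require 'tempfile'

# Hypothetical helper: count lines matching a pattern, one line at a time.
def count_matching_lines(path, pattern)
  File.foreach(path).count { |line| line =~ pattern }
end

Tempfile.create('log') do |file|
  file.puts 'INFO boot', 'ERROR disk full', 'INFO done'
  file.flush # make sure the lines are on disk before re-reading by path
  puts count_matching_lines(file.path, /ERROR/) # 1
end
```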

## Sub-classes and Duck-types

Ruby gives us a couple of subclasses of `IO` that are more specialized for a
particular type of I/O, along with some classes that merely behave like `IO`:

### File

[`File` docs](http://www.ruby-doc.org/core-2.1.2/File.html)

Probably the most well known `IO` subclass. `File` allows us to read/write files
without messing around with file descriptors. It also adds file-specific
convenience methods such as `File#size`, `File#chmod`, and `File.path`.
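
A short sketch of what that looks like in practice: `File.open` with a block closes the stream for us when the block returns. The `write_then_inspect` helper and the `hello.txt` file name are hypothetical:

```ruby
require 'tmpdir'

# Hypothetical helper: write a line to a file, then reopen it and return
# its size in bytes along with its first line.
def write_then_inspect(path, text)
  File.open(path, 'w') { |file| file.puts text }
  File.open(path) { |file| [file.size, file.gets] }
end

Dir.mktmpdir do |dir|
  size, first_line = write_then_inspect(File.join(dir, 'hello.txt'), 'hello world')
  puts size               # 12
  puts first_line.inspect # "hello world\n"
end
```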

### The Sockets

Socket docs:

* [`TCPSocket`](http://www.ruby-doc.org/stdlib-2.1.2/libdoc/socket/rdoc/TCPSocket.html)
* [`UDPSocket`](http://www.ruby-doc.org/stdlib-2.1.2/libdoc/socket/rdoc/UDPSocket.html)
* [`UNIXSocket`](http://www.ruby-doc.org/stdlib-2.1.2/libdoc/socket/rdoc/UNIXSocket.html)
* [`Socket`](http://www.ruby-doc.org/stdlib-2.1.2/libdoc/socket/rdoc/Socket.html)

Ruby's various socket classes all ultimately inherit from `IO`.

For example, I have a server running on `localhost:3000`:

```ruby
[1] pry(main)> require 'socket'
=> true
[2] pry(main)> socket = TCPSocket.new 'localhost', 3000
=> #<TCPSocket:fd 10>
[3] pry(main)> socket.puts 'GET "/"'
=> nil
[4] pry(main)> socket.gets
=> "HTTP/1.1 400 Bad Request \r\n"
```
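
For a version that runs without a server, this sketch uses `UNIXSocket.pair`, which returns two already-connected sockets on Unix-like systems. The `socket_round_trip` helper is a hypothetical name; the point is only that sockets answer to the same `puts` and `gets` as any other `IO`:

```ruby
require 'socket'

# Hypothetical helper: send a line through one end of a connected socket
# pair and read it back from the other end.
def socket_round_trip(message)
  writer, reader = UNIXSocket.pair
  writer.puts message
  reader.gets
ensure
  writer.close if writer
  reader.close if reader
end

puts UNIXSocket.ancestors.include?(IO)  # true
puts socket_round_trip('hello').inspect # "hello\n"
```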

### StringIO

[`StringIO` docs](http://www.ruby-doc.org/stdlib-2.1.2/libdoc/stringio/rdoc/StringIO.html)

`StringIO` allows strings to behave like `IO`s. This is useful when we want to
pass strings into systems that consume streams. This is common in tests where
we might inject a `StringIO` instead of reading an actual file from disk.
Unlike previous classes showcased, `StringIO` does not inherit from `IO`.

```ruby
[1] pry(main)> string_io = StringIO.new('hello world')
=> #<StringIO:0x007feacb0cd4e8>
[2] pry(main)> string_io.gets
=> "hello world"
[3] pry(main)> string_io.puts 'goodbye world'
=> nil
[4] pry(main)> string_io.rewind
=> 0
[5] pry(main)> string_io.read
=> "hello worldgoodbye world\n"
```
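
To make the duck typing concrete, here is a sketch of a consumer that only cares that its argument responds to `each_line`. The `count_lines` helper is hypothetical:

```ruby
require 'stringio'

# Hypothetical consumer: works with any object that responds to `each_line`.
def count_lines(io)
  io.each_line.count
end

string_io = StringIO.new("one\ntwo\nthree\n")
puts count_lines(string_io) # 3
puts string_io.is_a?(IO)    # false

# `StringIO#string` returns everything written so far, no rewind needed.
output = StringIO.new
output.puts 'hello world'
puts output.string.inspect  # "hello world\n"
```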

### Tempfile

[`Tempfile` docs](http://www.ruby-doc.org/stdlib-2.1.2/libdoc/tempfile/rdoc/Tempfile.html)

`Tempfile` is another class that doesn't inherit from `IO`. Instead, it
implements `File`'s interface and deals with temporary files. As such, it can be
passed to any object that consumes `IO`-like objects.
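
A minimal sketch of a `Tempfile` being used like any other stream, with cleanup once we are done. The `round_trip_through_tempfile` helper and the `report` name prefix are made up for this example:

```ruby
require 'tempfile'

# Hypothetical helper: write text to a temporary file, read it back, then
# close and delete the file.
def round_trip_through_tempfile(text)
  file = Tempfile.new('report')
  file.write(text)
  file.rewind
  file.read
ensure
  if file
    file.close
    file.unlink
  end
end

puts round_trip_through_tempfile("a,b\n1,2\n")
# a,b
# 1,2
```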

## Putting it all together

Say we have the following class for some command-line program:

```ruby
class SystemTask
  def execute
    puts "preparing to execute"

    puts "starting first task"
    first_task

    puts "starting second task"
    second_task

    puts "execution complete"
  end
end
```

Testing this class causes all these messages to be output, cluttering our
results. One approach to solving this problem would be to inject `IO` objects
instead of calling `Kernel#puts` and to pass in a null object in tests.

```ruby
class SystemTask
  def initialize(io=$stdout)
    @io = io
  end

  def execute
    @io.puts "preparing to execute"

    @io.puts "starting first task"
    first_task

    @io.puts "starting second task"
    second_task

    @io.puts "execution complete"
  end
end
```

In production, we can still call `SystemTask.new.execute` as before.
Now we can pass in our own `IO` in tests. This could be a test double, a
`StringIO`, or a stream to `/dev/null`:

```ruby
describe SystemTask do
  # test double
  it "executes tasks" do
    io = double("io", puts: nil)
    system_task = SystemTask.new(io)

    system_task.execute

    # expect things to have happened

    # if we care about the messages, we can also expect on the double
    expect(io).to have_received(:puts).with("preparing to execute")
  end

  # StringIO
  it "executes tasks" do
    io = StringIO.new
    system_task = SystemTask.new(io)

    system_task.execute

    # expect things to have happened

    # if we care about the messages, read them from the string io
    io.rewind
    expect(io.read).to eq(
      "preparing to execute\n" \
      "starting first task\n" \
      "starting second task\n" \
      "execution complete\n"
    )
  end

  # /dev/null
  it "executes tasks" do
    io = File.open(File::NULL, 'w')
    system_task = SystemTask.new(io)

    system_task.execute

    # expect things to have happened

    # only use /dev/null if we don't care about the messages
  end
end
```
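
The "null object" option could also be a small hand-rolled class instead of a stream to `/dev/null`. This is only a sketch: `NullIO` and `QuietTask` are hypothetical names, and `QuietTask` is a trimmed-down stand-in for `SystemTask`:

```ruby
# Hypothetical null object: accepts the same messages as an IO and
# discards them.
class NullIO
  def puts(*_args)
    nil
  end

  def print(*_args)
    nil
  end
end

# Trimmed-down stand-in for SystemTask, for illustration only.
class QuietTask
  def initialize(io = $stdout)
    @io = io
  end

  def execute
    @io.puts 'preparing to execute'
    :done
  end
end

# Prints "done"; the log line itself is swallowed by the null object.
puts QuietTask.new(NullIO.new).execute
```

Since it only implements `puts` and `print`, this null object would need more methods if the code under test called anything else on its `IO`.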

## Working with disparate APIs

While working on a recent project that pulled reports from several <abbr
title="Application Programming Interface">API</abbr>s, we noticed that some
responses were strings, others were CSV documents, and others only generated
the report, leaving us to make a request to another endpoint to download it.

The solution was to create an adapter for each <abbr title="Application
Programming Interface">API</abbr> that would get the data and return it in a
standard format wrapped in some type of IO-like object. A persistor object could
then process and persist any of the reports as long as they were formatted the
same way and were `IO`-like. For example:

```ruby
class API1Report
  def fetch
    # fetch report (comes down as a CSV doc)
    # process it to get it in a standard format
    # return standardized report as a Tempfile object
  end
end

class API2Report
  def fetch
    # fetch report
    # returns it as a File object
  end
end

class Persistor
  def initialize(report)
    @report = report
  end

  def persist
    # process and persist the report
  end
end
```
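
To show how the pieces could fit together, here is a runnable sketch. Every name in it (`InlineReport`, `DownloadedReport`, `ArrayPersistor`) and the sample data are hypothetical, and the "persistence" is just an in-memory array rather than whatever the real project used:

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical adapter that returns its report as a StringIO.
class InlineReport
  def fetch
    StringIO.new("id,total\n1,10\n2,20\n")
  end
end

# Hypothetical adapter that returns its report as a Tempfile.
class DownloadedReport
  def fetch
    file = Tempfile.new('report')
    file.write("id,total\n3,30\n")
    file.rewind # move the cursor back so the consumer can read from the start
    file
  end
end

# Hypothetical persistor: it only relies on `each_line`, so any IO-like
# report works. Rows are kept in memory instead of a real data store.
class ArrayPersistor
  attr_reader :rows

  def initialize(report)
    @report = report
    @rows = []
  end

  def persist
    @report.each_line { |line| @rows << line.chomp.split(',') }
    @rows.size
  end
end

[InlineReport.new, DownloadedReport.new].each do |adapter|
  puts ArrayPersistor.new(adapter.fetch).persist
end
# 3
# 2
```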

## What's next

Read [an overview of 4.4 BSD's
I/O](http://www.freebsd.org/doc/en/books/design-44bsd/overview-io-system.html)
to develop a deeper understanding of Unix I/O, file descriptors, and devices.

Read [the TTY system](http://www.linusakesson.net/programming/tty/) to
understand the relationship between Unix jobs, processes, and I/O with the TTY
device.

Practice Ruby I/O by [cloning this
repo](https://practicingruby.com/articles/study-guide-1?u=dc2ab0f9bb).

Finally, go deeper into Ruby's I/O in [this chapter from Read
Ruby](http://readruby.io/io).
