---
title: Tab completion in GNU Readline
teaser: 'Learn how to use GNU Readline''s powerful tab completion features in your
  own applications.

  '
tags: gitsh,readline,c
author: George Brocklehurst
published_on: 2016-08-08
---

[GNU Readline][readline]
is a powerful [line editor][line-editor]
with support for [fancy editing commands][cheatsheet],
history,
and tab completion.
Even if you're not familiar with the name Readline,
you might still be using it:
it's integrated into all kinds of tools
including [GNU Bash][bash],
various language <abbr title="Read-Evaluate-Print Loops">REPLs</abbr>,
and our own
[gitsh][gitsh] project.

This post will talk you through
the more advanced Readline tab completion features gitsh uses
and show you how to use them
in your own programs.

To avoid getting lost in the details of the
gitsh code<sup><a href="#fn1" id="r1" title="Footnote 1">1</a></sup>,
we'll use a simplified [example application][example-repo]
for this post.

[readline]: https://cnswww.cns.cwru.edu/php/chet/readline/rltop.html
[line-editor]: https://en.wikipedia.org/wiki/GNU_Readline
[cheatsheet]: http://readline.kablamo.org/emacs.html
[bash]: https://www.gnu.org/software/bash/
[gitsh]: https://github.com/thoughtbot/gitsh

## Basic tab completion

To get us started,
here's the simplest Readline program I can think of.
It uses Readline to get input from the user,
echoes that input back,
and then exits.

<figure>
<pre><code class="c">#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;readline/readline.h&gt;

int
main(int argc, char *argv[])
{
    char *buffer = readline("> ");
    if (buffer) {
        printf("You entered: %s\n", buffer);
        free(buffer);
    }

    return 0;
}</code></pre>
<figcaption>
  <a href="https://github.com/georgebrock/readline-example/blob/9b8c3e6/main.c">
    <code>main.c</code> at revision 9b8c3e6
  </a>
</figcaption>
</figure>

Hiding among the boilerplate code
is our first invocation of a GNU Readline function:

```c
char *buffer = readline("> ");
```

The [`readline`][docs-readline] function prompts the user for input,
with all of Readline's power behind it.
This includes tab completion for file system paths.
If you don't want to complete anything more than filenames
you don't need to go any further than this.

### Custom completion options

In gitsh—and many other programs that use Readline—it's
useful to be able to complete things other than paths.
In gitsh,
we're interested in completing things like
Git commands,
branch names,
and remotes.
For the purpose of this example,
let's say we're only interested in completing values from
a fixed list of the names of some characters from
<cite>The Hitchhiker's Guide to the Galaxy</cite>.

Here's our expanded program with custom tab completion:

<figure>
<pre><code class="c">#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;readline/readline.h&gt;

char **character_name_completion(const char *, int, int);
char *character_name_generator(const char *, int);

char *character_names[] = {
    "Arthur Dent",
    "Ford Prefect",
    "Tricia McMillan",
    "Zaphod Beeblebrox",
    NULL
};

int
main(int argc, char *argv[])
{
    rl_attempted_completion_function = character_name_completion;

    printf("Who's your favourite Hitchhiker's Guide character?\n");
    char *buffer = readline("> ");
    if (buffer) {
        printf("You entered: %s\n", buffer);
        free(buffer);
    }

    return 0;
}

char **
character_name_completion(const char *text, int start, int end)
{
    rl_attempted_completion_over = 1;
    return rl_completion_matches(text, character_name_generator);
}

char *
character_name_generator(const char *text, int state)
{
    static int list_index, len;
    char *name;

    if (!state) {
        list_index = 0;
        len = strlen(text);
    }

    while ((name = character_names[list_index++])) {
        if (strncmp(name, text, len) == 0) {
            return strdup(name);
        }
    }

    return NULL;
}</code></pre>
<figcaption>
  <a href="https://github.com/georgebrock/readline-example/blob/ef33b0b/main.c">
    <code>main.c</code> at revision ef33b0b
  </a>
</figcaption>
</figure>

We're making use of three new Readline features here.

First, we set
[`rl_attempted_completion_function`][docs-rl_attempted_completion_function]:

```c
rl_attempted_completion_function = character_name_completion;
```

When the user hits their tab key,
Readline will invoke the function we've assigned to
`rl_attempted_completion_function`.
The partial argument we're completing
and the positions where it starts and ends in the current line of input
will be passed as arguments.

If we modified our `character_name_completion` function
to print its arguments,
we'd see something like this:

<figure>
<pre><samp>Who's your favourite Hitchhiker's Guide character?</samp>
<samp>&gt; </samp><kbd>I like Arth<kbd title="tab">⇥</kbd></kbd>
<samp>text="Arth", start=7, end=11</samp></pre>
<figcaption>
  Output from
  <a href="https://github.com/georgebrock/readline-example/blob/2c50931/main.c#L35">
    <code>character_name_completion</code>
    modified to print arguments
  </a>
</figcaption>
</figure>

Note that we're only passed `"Arth"`,
and not the whole input.
Given this information,
we need to return the possible completions:

* If there are no possible completions,
  we should return `NULL`.
* If there is one possible completion,
  we should return an array containing
  that completion,
  followed by a `NULL` value.
* If there are two or more possibilities,
  we should return an array containing
  the longest common prefix of all the options,
  followed by each of the possible completions,
  followed by a `NULL` value.
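
To make the array layout concrete,
here is a minimal sketch that builds it by hand
from a fixed set of candidates.
The names `build_matches`, `common_prefix_len`, and `copy_prefix`
are hypothetical helpers invented for this illustration,
not part of Readline,
and error handling is kept to a minimum:

```c
#include <stdlib.h>
#include <string.h>

/* Copy the first n bytes of s into a new NUL-terminated heap string. */
static char *
copy_prefix(const char *s, size_t n)
{
    char *copy = malloc(n + 1);
    if (copy) {
        memcpy(copy, s, n);
        copy[n] = '\0';
    }
    return copy;
}

/* Length of the longest prefix shared by a and b. */
static size_t
common_prefix_len(const char *a, const char *b)
{
    size_t i = 0;
    while (a[i] && a[i] == b[i]) {
        i++;
    }
    return i;
}

/*
 * Hypothetical helper (not a Readline function): arrange `count`
 * candidates in the layout described above. The caller owns the
 * array and every string in it. Failed string copies are not
 * handled, to keep the sketch short.
 */
char **
build_matches(const char *candidates[], size_t count)
{
    char **matches;
    size_t i, prefix_len;

    if (count == 0) {
        return NULL;                    /* no possible completions */
    }

    matches = malloc((count + 2) * sizeof(char *));
    if (!matches) {
        return NULL;
    }

    if (count == 1) {                   /* the completion, then NULL */
        matches[0] = copy_prefix(candidates[0], strlen(candidates[0]));
        matches[1] = NULL;
        return matches;
    }

    /* Two or more: find the prefix shared by every candidate. */
    prefix_len = strlen(candidates[0]);
    for (i = 1; i < count; i++) {
        size_t n = common_prefix_len(candidates[0], candidates[i]);
        if (n < prefix_len) {
            prefix_len = n;
        }
    }

    matches[0] = copy_prefix(candidates[0], prefix_len);
    for (i = 0; i < count; i++) {
        matches[i + 1] = copy_prefix(candidates[i], strlen(candidates[i]));
    }
    matches[count + 1] = NULL;
    return matches;
}
```

Given the two candidates `"Ford Prefect"` and `"Fenchurch"`,
this produces `"F"`, `"Ford Prefect"`, `"Fenchurch"`, and a terminating `NULL`.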

Rather than building this array by hand,
including all of the complexity of finding the longest common prefix,
we can use the helpful
[`rl_completion_matches`][docs-rl_completion_matches] function
with a generator function:

```c
return rl_completion_matches(text, character_name_generator);
```

The generator function—in our case
`character_name_generator`—is called with
the `text` that was passed to `rl_completion_matches`,
and a `state` value that will be zero on the first call
and non-zero on subsequent calls
(we're using the fact that `state` is zero on the first call
to initialise some static variables,
but otherwise ignoring it).

Each time it's called,
`character_name_generator` returns a completion that matches the given text.
When it can't find any more options
it returns `NULL`.
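
The calling convention can be sketched with a small driver
that calls a generator the way it's described above:
`state` is zero on the first call,
non-zero afterwards,
and the loop stops at the first `NULL`.
`count_matches` is a hypothetical stand-in
for the loop inside `rl_completion_matches`,
not Readline's actual code,
and the generator is repeated here
(with `malloc` in place of `strdup`)
so the block stands alone:

```c
#include <stdlib.h>
#include <string.h>

static const char *character_names[] = {
    "Arthur Dent",
    "Ford Prefect",
    "Tricia McMillan",
    "Zaphod Beeblebrox",
    NULL
};

/* Same logic as the generator in the example program. */
char *
character_name_generator(const char *text, int state)
{
    static size_t list_index, len;
    const char *name;

    if (!state) {                       /* first call: reset the search */
        list_index = 0;
        len = strlen(text);
    }

    while ((name = character_names[list_index++])) {
        if (strncmp(name, text, len) == 0) {
            char *copy = malloc(strlen(name) + 1);
            if (copy) {
                strcpy(copy, name);
            }
            return copy;                /* caller takes ownership */
        }
    }

    return NULL;                        /* no more matches */
}

/*
 * Hypothetical driver: call `generator` until it returns NULL and
 * report how many matches it produced.
 */
int
count_matches(const char *text, char *(*generator)(const char *, int))
{
    int state = 0;
    char *match;

    while ((match = generator(text, state)) != NULL) {
        free(match);
        state++;                        /* non-zero on every later call */
    }

    return state;
}
```

With this list,
`count_matches("A", character_name_generator)` finds one match,
and an empty string matches all four names.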

If our `character_name_completion` function
returned no matches
(i.e. `character_name_generator` returned `NULL` on the first call),
Readline would fall back to its default path completion.
In this case
we don't want that to happen,
so we added one more line to `character_name_completion`
to tell it our list of completions is final,
even when it's empty,
by setting [`rl_attempted_completion_over`][docs-rl_attempted_completion_over]
to a non-zero value:

```c
rl_attempted_completion_over = 1;
```

## Quoting and escaping

Our current implementation works well enough
when the user is entering the name of a single character.
But what would happen if they needed to enter a list of characters,
separated by spaces?
How would we know if we were seeing
a space between a character's first name and last name,
or a space between two different characters?

Shells like bash, zsh, and gitsh
get around this with quoting and escaping.

We could <dfn>quote</dfn>
each character's name:

<pre><kbd>"Arthur Dent" "Ford Prefect"</kbd></pre>

Or we could <dfn>escape</dfn> the spaces
that don't indicate the start of a new character's name:

<pre><kbd>Arthur\ Dent Ford\ Prefect</kbd></pre>

Quoting and escaping are important for tab completion.
As we've seen,
Readline passes only the last argument of the user's input
to our completion function.
If we want to support quoting and escaping
we need some way of telling Readline
if the space separating two words
counts as the start of a new argument.
We also need to make sure
that when we complete an argument containing a space,
it is appropriately escaped.

The cases we need to cover are:

<table>
  <thead>
    <tr>
      <th scope="col">Input</th>
      <th scope="col">Expected output</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><kbd>"Arthu<kbd title="tab">⇥</kbd></kbd></td>
      <td><samp>"Arthur Dent"</samp></td>
    </tr>
    <tr>
      <td><kbd>"Arthur D<kbd title="tab">⇥</kbd></kbd></td>
      <td><samp>"Arthur Dent"</samp></td>
    </tr>
    <tr>
      <td><kbd>Arthu<kbd title="tab">⇥</kbd></kbd></td>
      <td><samp>Arthur\ Dent</samp></td>
    </tr>
    <tr>
      <td><kbd>Arthur\ D<kbd title="tab">⇥</kbd></kbd></td>
      <td><samp>Arthur\ Dent</samp></td>
    </tr>
  </tbody>
</table>

### Adding quoting support

Quoting is easier than escaping,
so let's tackle that first.

All we need to do
is tell Readline
which characters our program uses
as delimiters for quoted strings,
by setting
[`rl_completer_quote_characters`][docs-rl_completer_quote_characters]:

<figure>
<pre><code class="c">rl_completer_quote_characters = "\"'";</code></pre>
<figcaption>
  <a href="https://github.com/georgebrock/readline-example/commit/203ce44">
    Changes introduced by revision 203ce44
  </a>
</figcaption>
</figure>

Now,
when we press tab
within a single- or double-quoted string,
Readline will pass everything after the opening quote
to our completion function.

It'll even close the quotes for us
if there's only one possible completion,
or leave them open
if there are several to choose from.

### Adding escaping support

The first thing we need to do
to support escaping
is to make sure that the completion options we return
are properly escaped.

We'd expect unquoted input to produce escaped output,
and quoted input to produce unescaped but quoted output:

<table>
  <thead>
    <tr>
      <th scope="col">Input</th>
      <th scope="col">Expected output</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><kbd>Arthu<kbd title="tab">⇥</kbd></kbd></td>
      <td><samp>Arthur\ Dent</samp></td>
    </tr>
    <tr>
      <td><kbd>"Arthu<kbd title="tab">⇥</kbd></kbd></td>
      <td><samp>"Arthur Dent"</samp></td>
    </tr>
  </tbody>
</table>

Conveniently,
we've already set `rl_completer_quote_characters`,
so Readline is aware of whether or not we are
completing a quoted string.

We can modify our `character_name_generator` function
to read the
[`rl_completion_quote_character`][docs-rl_completion_quote_character]
variable and
produce escaped character names
if we're not completing a quoted argument:

<figure>
<pre><code class="c">char *
character_name_generator(const char *text, int state)
{
    static int list_index, len;
    char *name;

    if (!state) {
        list_index = 0;
        len = strlen(text);
    }

    while ((name = character_names[list_index++])) {
        if (rl_completion_quote_character) {
            name = strdup(name);
        } else {
            name = escape(name);
        }

        if (strncmp(name, text, len) == 0) {
            return name;
        } else {
            free(name);
        }
    }

    return NULL;
}

char *
escape(const char *original)
{
    char *escaped;
    // ...
    return escaped;
}</code></pre>
<figcaption>
  <a href="https://github.com/georgebrock/readline-example/commit/9e18d61">
    Changes introduced by revision 9e18d61
  </a>
</figcaption>
</figure>

The important bit of new functionality here
is that we conditionally escape our options:

```c
if (rl_completion_quote_character) {
    name = strdup(name);
} else {
    name = escape(name);
}
```

If Readline has seen an unclosed quote
it will set `rl_completion_quote_character`
to the appropriate quote character
(in our case `'` or `"`,
since those are the characters we listed in `rl_completer_quote_characters`).
If `rl_completion_quote_character` is zero,
we know we're not completing a quoted argument.
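
As a rough mental model of what "an unclosed quote" means,
here is a simplified sketch
that scans a line for a quote that has been opened but not closed.
`open_quote_char` is a hypothetical function written for this post's example,
not Readline's implementation,
and it deliberately ignores backslash escapes:

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified model, not Readline's code: return the quote character
 * that is still open at the end of `line`, or '\0' if every quote
 * has been closed. `quote_chars` plays the role of
 * rl_completer_quote_characters.
 */
char
open_quote_char(const char *line, const char *quote_chars)
{
    char open = '\0';
    size_t i;

    for (i = 0; line[i]; i++) {
        if (open) {
            if (line[i] == open) {
                open = '\0';            /* matching quote closes the string */
            }
        } else if (strchr(quote_chars, line[i])) {
            open = line[i];             /* a new quoted string starts here */
        }
    }

    return open;
}
```

For the input `"Arthur D` this returns `"`,
and for `"Arthur Dent"` or plain `Arthu` it returns zero.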

The `escape` function I've written for this example
allocates a new character array on the heap,
so we don't need to use `strdup`
if we've already used `escape`<sup><a href="#fn2" id="r2" title="Footnote 2">2</a></sup>.

I've omitted the full implementation of `escape` here
because it's rather long,
but you can see the
[full example code on GitHub][example-repo].
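
If you'd like something to experiment with in the meantime,
here is a minimal sketch of what such a function could look like.
It is not the implementation from the example repository:
the name `escape_sketch` is hypothetical,
and it assumes that spaces and backslashes
are the only characters that need escaping.

```c
#include <stdlib.h>

/*
 * Hypothetical sketch: return a heap-allocated copy of `original`
 * with a backslash inserted before every space and backslash.
 * Returns NULL if allocation fails. The caller frees the result.
 */
char *
escape_sketch(const char *original)
{
    size_t i, j, extra = 0;
    char *escaped;

    /* First pass: count the characters that need a backslash. */
    for (i = 0; original[i]; i++) {
        if (original[i] == ' ' || original[i] == '\\') {
            extra++;
        }
    }

    /* i now holds the length of the original string. */
    escaped = malloc(i + extra + 1);
    if (!escaped) {
        return NULL;
    }

    /* Second pass: copy, inserting backslashes where needed. */
    for (i = 0, j = 0; original[i]; i++) {
        if (original[i] == ' ' || original[i] == '\\') {
            escaped[j++] = '\\';
        }
        escaped[j++] = original[i];
    }
    escaped[j] = '\0';

    return escaped;
}
```

Passing `Arthur Dent` to this sketch yields `Arthur\ Dent`.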

[example-repo]: https://github.com/georgebrock/readline-example

### Detecting escaped word breaks

This is getting pretty good,
but we're still left with one case we can't handle.
Consider user input that contains an escaped space:

<table>
  <thead>
    <tr>
      <th scope="col">Input</th>
      <th scope="col">Expected output</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><kbd>Arthur\ D<kbd title="tab">⇥</kbd></kbd></td>
      <td><samp>Arthur\ Dent</samp></td>
    </tr>
  </tbody>
</table>

Readline will still see the space
as an argument boundary.
Our completion function will be passed `"D"`,
when we want it to be passed `"Arthur\ D"`.

To handle this,
we need to give Readline
a pointer to a function that can tell it
if the space between words is escaped,
which we can do with the
[`rl_char_is_quoted_p`][docs-rl_char_is_quoted_p]
setting:

<figure>
<pre><code class="c">rl_char_is_quoted_p = &amp;quote_detector;</code></pre>
<figcaption>
  <a href="https://github.com/georgebrock/readline-example/blob/5219206/main.c#L26">
    From <code>main.c</code> at revision 5219206
  </a>
</figcaption>
</figure>

Our `quote_detector` function
takes the whole line of input
and the index of the space
that might indicate a break between arguments,
or a quote character that might indicate the start of a quoted string.
It should return zero
if the character isn't quoted,
and a non-zero value
if it is quoted:

<figure>
<pre><code class="c">int
quote_detector(char *line, int index)
{
    return (
        index &gt; 0 &amp;&amp;
        line[index - 1] == '\\' &amp;&amp;
        !quote_detector(line, index - 1)
    );
}</code></pre>
<figcaption>
  <a href="https://github.com/georgebrock/readline-example/blob/5219206/main.c#L111-L119">
    <code>quote_detector</code> from <code>main.c</code> at revision 5219206
  </a>
</figcaption>
</figure>

It's worth noting that this implementation is recursive.
In many shells,
it's possible to escape the `\` character with another `\` character.
The sequence `\\` represents a literal `\`
and doesn't escape the character that follows it.
The recursion makes sure we handle
any number of `\` characters before a space,
and always do the right thing.
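
Another way to read the recursion:
a character is escaped
when it's preceded by an odd number of consecutive `\` characters.
The sketch below puts the recursive function
next to an iterative equivalent built on that observation.
`quote_detector_iterative` is a hypothetical name for this illustration;
the example application uses the recursive form.

```c
/* The recursive implementation from the example application. */
int
quote_detector(char *line, int index)
{
    return (
        index > 0 &&
        line[index - 1] == '\\' &&
        !quote_detector(line, index - 1)
    );
}

/*
 * Hypothetical iterative equivalent: count the backslashes directly
 * before `index` and report whether that count is odd.
 */
int
quote_detector_iterative(char *line, int index)
{
    int backslashes = 0;

    while (index > 0 && line[index - 1] == '\\') {
        backslashes++;
        index--;
    }

    return backslashes % 2;
}
```

For the input `Arthur\ D`,
both functions report that the space at index 7 is escaped.
For `Arthur\\ D`,
the two backslashes form a literal `\`,
so the space at index 8 is not escaped.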

#### When is `rl_char_is_quoted_p` called?

The Readline documentation
would have us believe that there's nothing else we need to do.
The reality is a little more complex.

Readline won't make use of `rl_char_is_quoted_p`
unless it believes some kind of quoting or escaping
is being used in the user's input.
Remember our old friend `rl_completion_quote_character`?
We used it to determine if we needed to escape our completion options.
Readline does something similar
with the closely related
[`rl_completion_found_quote`][docs-rl_completion_found_quote]
variable to determine
if it needs to call `rl_char_is_quoted_p`<sup><a href="#fn3" id="r3"
title="Footnote 3">3</a></sup>.

There are several practical implications of this:

* `rl_completion_found_quote` is only ever set
  if `rl_completer_quote_characters` is set.
  Therefore, without `rl_completer_quote_characters`,
  `rl_char_is_quoted_p` does nothing.

* `rl_completion_found_quote` is only ever set
  if the input contains an unclosed quoted string,
  or a literal `\` character.
  This limits the kind of escaping schemes
  `rl_char_is_quoted_p` can implement
  to those that use a `\` in some way.

#### Which characters separate arguments?

Readline will only invoke `rl_char_is_quoted_p`
with characters that would,
if unescaped,
indicate a break between arguments.

For our `quote_detector` implementation to work,
we need to customise the list of word break characters:

<figure>
<pre><code class="c">rl_completer_word_break_characters = " ";</code></pre>
<figcaption>
  <a href="https://github.com/georgebrock/readline-example/blob/5219206/main.c#L25">
    From <code>main.c</code> at revision 5219206
  </a>
</figcaption>
</figure>

We've been happily completing space-separated arguments
from the very first example,
so why do we need to explicitly specify this now?

The default value of
[`rl_completer_word_break_characters`][docs-rl_completer_word_break_characters]
includes the `\` character,
which we use for escaping.
If encountering a `\` indicated a word break,
we wouldn't get very far with escaped spaces;
Readline would include the space in the value
passed to our completion function,
but stop at the `\`.

An alternative solution to this problem
would be to decrement [`rl_point`][docs-rl_point]
in our `rl_char_is_quoted_p` function,
but since we don't need `\` characters
to act as word breaks,
we can happily remove them
from `rl_completer_word_break_characters`.
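
To see how the word break characters and the quote detector interact,
here is a deliberately simplified model
of the backwards scan that finds the start of the word being completed.
`find_word_start` is a hypothetical function,
not Readline's implementation;
among other things,
it ignores the `rl_completion_found_quote` check described earlier.

```c
#include <stddef.h>
#include <string.h>

/* The quote detector from the example application. */
int
quote_detector(char *line, int index)
{
    return (
        index > 0 &&
        line[index - 1] == '\\' &&
        !quote_detector(line, index - 1)
    );
}

/*
 * Simplified model, not Readline's code: scan backwards from `point`
 * and return the index where the word to complete starts. A character
 * in `breaks` ends the scan unless `is_quoted` says it is escaped.
 * Pass NULL for `is_quoted` to model having no quote detector.
 */
int
find_word_start(char *line, int point, const char *breaks,
                int (*is_quoted)(char *, int))
{
    int i = point;

    while (i > 0) {
        char c = line[i - 1];
        int escaped = is_quoted ? is_quoted(line, i - 1) : 0;

        if (strchr(breaks, c) != NULL && !escaped) {
            break;                      /* the word starts after this character */
        }
        i--;
    }

    return i;
}
```

For the nine-character input `Arthur\ D`,
the model returns 0 when `breaks` is `" "` and the detector is in place,
so the whole input is the word to complete.
Adding `\` to `breaks` makes it return 7,
leaving a word of ` D`,
and removing the detector while keeping `" "` makes it return 8,
leaving just `D`.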

## That's all, folks

So far,
that's everything we're using in gitsh.
But we're still only scratching the surface
of what GNU Readline can do.

---

## Update: Tab completion in Ruby

When I wrote this post,
many of the features it covered weren't available
via Ruby's `Readline` module.

A couple of patches later,
all of this is possible in Ruby.

Check out the [Ruby edition of this post][ruby-edition]
to see the same example without a single line of C.

[ruby-edition]: https://thoughtbot.com/blog/tab-completion-in-gnu-readline-ruby-edition

---

<a href="#r1" id="fn1">[1]</a>
gitsh is mostly implemented in Ruby, and
until very recently we used
[Ruby's built-in `Readline` module][ruby-readline].
The default Ruby bindings only expose a subset of
Readline's functionality—it's a very useful subset,
but gitsh has now outgrown it.
In the gitsh source,
we expose the features discussed in this post via a Ruby extension,
and then make use of them from Ruby.
To keep things simple
I'll stick to C in this post,
but you can see the full Ruby implementation
in
[gitsh's `line_editor.c` file][gitsh-line-editor].

<a href="#r2" id="fn2">[2]</a>
We could be more memory efficient here,
and avoid calling `strdup` for strings that don't match the user input,
but the code would be harder to read.
I'm generally in favour of sacrificing a little efficiency
for readability,
and doubly so in examples.

<a href="#r3" id="fn3">[3]</a>
To be more precise,
the value of a local variable called `found_quote` is used
to determine if `rl_char_is_quoted_p` should be called
before it's assigned to
the externally accessible `rl_completion_found_quote`.
See the `_rl_find_completion_word` function definition
in `lib/readline/complete.c` in the GNU Bash source code for details.

[ruby-readline]: http://ruby-doc.org/stdlib-2.3.1/libdoc/readline/rdoc/Readline.html
[gitsh-line-editor]: https://github.com/thoughtbot/gitsh/blob/master/ext/gitsh/src/line_editor.c

[docs-readline]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX200
[docs-rl_attempted_completion_function]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX361
[docs-rl_attempted_completion_over]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX388
[docs-rl_char_is_quoted_p]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX364
[docs-rl_completer_quote_characters]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX375
[docs-rl_completer_word_break_characters]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX373
[docs-rl_completion_found_quote]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX383
[docs-rl_completion_matches]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX357
[docs-rl_completion_quote_character]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX381
[docs-rl_point]: https://cnswww.cns.cwru.edu/php/chet/readline/readline.html#IDX203
