Indexes give the database server a much faster way to lookup and retrieve
specific rows. Postgres offers several different types of indexes, each of which performs best for certain types of WHERE
clauses.
Standard comparisons (B-Tree Index)
Postgres will default to using a B-Tree for an index that doesn’t specify the index type.
--This index is created using a B-Tree Index
CREATE INDEX email_idx ON users (email);
Most of our day to day querying relies on comparison operators and the Postgres default works great for these. Comparison operators include:
=
,<>
,!=
<
,>
,<=
,>=
DISTINCT
,NOT DISTINCT
IS NULL
,IS NOT NULL
If you’re creating an index for a column expected to make heavy use of these operators, you’re all set. Thanks, Postgres!
Inclusion (B-Tree Index)
As with standard comparisons, the default B-Tree index type works great for WHERE
clauses checking for inclusion with INCLUDES
or BETWEEN ... AND ...
.
Pattern matching with LIKE
/ILIKE
(GIN Index)
A lot of web apps have a search feature and a lot of search features get
implemented using LIKE
or its case-insensitive sibling ILIKE
. In most of these queries you’re likely using
an “unanchored” expression where both the right and left edges of the search string are variable.
--This LIKE expression is "unanchored"
SELECT * FROM films WHERE title LIKE '%query%';
--Compared to this one, which is "left anchored"
SELECT * FROM films WHERE title LIKE 'query%';
--Or to this one, which is "right anchored"
SELECT * FROM films WHERE title LIKE '%query';
If you’re using right anchored or unanchored LIKE
/ILIKE
expressions you’ll benefit greatly from a
GIN
index that uses a trigram operator class. This requires a bit more work
than creating a run of the mill index.
First, you’ll need to include the trigram module (pg_trgm
) in your database
if it isn’t already. Then you can create a GIN
index for your column that specifies the
trigram operator classes from that module.
--Install the pg_trgm module if it isn't already
CREATE EXTENSION IF NOT EXISTS pg_trgm;
--Create a GIN index specifying the trigram operator class (`gin_trgm_ops`)
CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);
If you’re certain you’re only using left anchored expressions, e.g. LIKE
'query%'
, then you should be all right sticking with the B-Tree defaults so
long as you’re running your database with the C
locale.
Pattern matching using SIMILAR TO
, NOT SIMILAR TO
, ~
, ~*
and regex friends (GIN Index)
Using regular expressions to find matches? Everything that applies to the
LIKE
/ILIKE
queries applies to you, too. If your regular expressions are
left anchored you can use the defaults, but if you’re using unanchored
expressions, opt for a GIN index.
--These are left anchored expression
SELECT * FROM films WHERE title SIMILAR TO 'query%';
SELECT * FROM films WHERE title ~ '^query';
--While these are unanchored expressions
SELECT * FROM films WHERE title SIMILAR TO '%query%';
SELECT * FROM films WHERE title ~ 'query';
--Create a GIN index specifying the trigram operator class (`gin_trgm_ops`)
CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);
Similarity using similarity()
and its operators including %
, <%
, %>
and others (GIN)
Where SIMILAR TO
and other regex clauses return either true or false, the similarity()
function provided by the pg_trgm
module allows for a more qualified assessment of matches.
This function will tell you how similar two strings are on a scale that goes from 0.0 (totally dissimilar) to 1.0 (completely similar).
The similarity()
function uses trigrams to determine these ratings and benefits dramatically from
adding a GIN
index. As with LIKE
and our regex friends, your GIN
index
should specify the gin_trgm_ops
operator class.
--Create a GIN index specifying the trigram operator class (`gin_trgm_ops`)
CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);
What’s next
The suggestions above will handle the majority of scenarios you will see in the wild. For niche situations, Postgres also offers several other index types such as BRIN
and HASH
. There are also some interesting edge-cases where a GiST
can outperform GIN
. The Postgres index docs are a great place to continue your optimization journey.