Indexes give the database server a much faster way to lookup and retrieve
specific rows. Postgres offers several different types of indexes, each of which performs best for certain types of
Postgres will default to using a B-Tree for an index that doesn’t specify the index type.
--This index is created using a B-Tree Index CREATE INDEX email_idx ON users (email);
Most of our day to day querying relies on comparison operators and the Postgres default works great for these. Comparison operators include:
IS NOT NULL
If you’re creating an index for a column expected to make heavy use of these operators, you’re all set. Thanks, Postgres!
As with standard comparisons, the default B-Tree index type works great for
WHERE clauses checking for inclusion with
BETWEEN ... AND ....
A lot of web apps have a search feature and a lot of search features get
LIKE or its case-insensitive sibling
ILIKE. In most of these queries you’re likely using
an “unanchored” expression where both the right and left edges of the search string are variable.
--This LIKE expression is "unanchored" SELECT * FROM films WHERE title LIKE '%query%'; --Compared to this one, which is "left anchored" SELECT * FROM films WHERE title LIKE 'query%'; --Or to this one, which is "right anchored" SELECT * FROM films WHERE title LIKE '%query';
If you’re using right anchored or unanchored
ILIKE expressions you’ll benefit greatly from a
GIN index that uses a trigram operator class. This requires a bit more work
than creating a run of the mill index.
First, you’ll need to include the trigram module (
pg_trgm) in your database
if it isn’t already. Then you can create a
GIN index for your column that specifies the
trigram operator classes from that module.
--Install the pg_trgm module if it isn't already CREATE EXTENSION IF NOT EXISTS pg_trgm; --Create a GIN index specifying the trigram operator class (`gin_trgm_ops`) CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);
If you’re certain you’re only using left anchored expressions, e.g.
'query%', then you should be all right sticking with the B-Tree defaults so
long as you’re running your database with the
Using regular expressions to find matches? Everything that applies to the
ILIKE queries applies to you, too. If your regular expressions are
left anchored you can use the defaults, but if you’re using unanchored
expressions, opt for a GIN index.
--These are left anchored expression SELECT * FROM films WHERE title SIMILAR TO 'query%'; SELECT * FROM films WHERE title ~ '^query'; --While these are unanchored expressions SELECT * FROM films WHERE title SIMILAR TO '%query%'; SELECT * FROM films WHERE title ~ 'query'; --Create a GIN index specifying the trigram operator class (`gin_trgm_ops`) CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);
SIMILAR TO and other regex clauses return either true or false, the
function provided by the
pg_trgm module allows for a more qualified assessment of matches.
This function will tell you how similar two strings are on a scale that goes from 0.0 (totally dissimilar) to 1.0 (completely similar).
similarity() function uses trigrams to determine these ratings and benefits dramatically from
GIN index. As with
LIKE and our regex friends, your
should specify the
gin_trgm_ops operator class.
--Create a GIN index specifying the trigram operator class (`gin_trgm_ops`) CREATE INDEX title_idx ON films USING GIN (title gin_trgm_ops);
The suggestions above will handle the majority of scenarios you will see in the wild. For niche situations, Postgres also offers several other index types such as
HASH. There are also some interesting edge-cases where a
GiST can outperform
GIN. The Postgres index docs are a great place to continue your optimization journey.