---
title: Joining Postgres tables using arrays of IDs
teaser: Some tips on how to do many-to-many SQL queries without a normal join table.
tags: databases,postgresql,sql,prisma,airtable
author: Dave Iverson
published_on: 2023-12-05
---

I'm working on a project with a slightly abnormal Postgres database structure. We're using a lovely tool called [Sequin][sequin] to sync data into Postgres from [Airtable][airtable].

In Airtable “linked fields” are represented as arrays of IDs: each ID pointing to a record in another table. That means that our Postgres database doesn't have standard many-to-many join tables. On a many-to-many relationship (e.g., Vendors <-> Industries) each row in the Vendors table will have an array of Industry IDs, and each row in Industries will have an array of Vendor IDs.

```sql
CREATE TABLE "industries" (
    "id" text NOT NULL,
    "name" text,
    "vendors" text[],
    PRIMARY KEY ("id")
);

CREATE TABLE "vendors" (
    "id" text NOT NULL,
    "name" text,
    "industries" text[],
    PRIMARY KEY ("id")
);
```

I've found a couple ways to do joins on this schema.

## Joining with the `ANY` array operator

The `ANY` operator will match any item in the array. You use it in a join like this.

```sql
SELECT
	vendors.id,
	vendors.name,
	industries.name AS industry_name
FROM vendors
JOIN industries ON industries.id = ANY (vendors.industries);
```

This will output a row for each vendor/industry combination.

| id                | name                 | industry_name                |
|-------------------|----------------------|------------------------------|
| recYDX6fBzZebC0ZC | Gumbs Partners       | HR Services                  |
| recjPKDyJMfu0Y6Ri | Lily of the Valley   | Floral Design	Floral Design  |
| recjPKDyJMfu0Y6Ri | Lily of the Valley   | Floral Design	Event Services |
| recdGYB0t9U7LozYj | NXTevent, Inc.       | Event Services               |
| receoJe6XpK7Sk8TZ | We Grow Microgreens, | LLC	Food & Beverage          |
| receoJe6XpK7Sk8TZ | We Grow Microgreens, | LLC	Event Services           |
| recm9iAPWPJSNhf4S | Manifested Events    | LLC	Event Services           |

## Joining with a Postgres view

Most ORMs will struggle to create a join on an array. I'm using [Prisma][prisma] which doesn’t understand that an array of strings can be used as a join to another table. It requires a join table for many-to-many relationships.

We can make a "virtual" join table using a database view, with Postgres’s `unnest()` function.

Credit goes to the folks at [Sequin][sequin] for suggesting this technique for working with Airtable data.

`unnest(array_column)` turns an array into rows, where each element of the array gets a row.

Here we'll use [Prisma's naming format][prisma-join-convention] for join tables.

```sql
CREATE VIEW _IndustryToVendor AS
SELECT
  vendors.id AS "B",
  unnest(vendors.industries) AS "A"
FROM
  vendors;

```

This is the resulting join table:

| A                 | B                 |
|-------------------|-------------------|
| rechodXXUzIyLWG1E | rec1nNDyfSeshQhVM |
| reczbFISIU5fu5Fda | recihNylMXO81fDgs |
| recyz3Dn3r8ML9UR3 | rec1yOytCf7VM5nZA |
| recyfeS5rXGahgh7H | rec1nNDyfSeshQhVM |
| reclBbzlO9pyaE1BV | recihNylMXO81fDgs |
| rec4cAq4w5EYg8GId | rec1nNDyfSeshQhVM |

And you can use it for a join like this:

```sql
SELECT
	vendors.id,
	vendors.name,
	industries.name AS industry_name
FROM vendors
JOIN "_IndustryToVendor"
ON "_IndustryToVendor"."B" = vendors.id
JOIN industries
ON "_IndustryToVendor"."A" = industries.id
```

This SQL query gives the same result as the `ANY` query above.

Prisma prefers for its join table to have foreign keys, but it’s not absolutely required. Since views can’t have foreign keys, we’ve done without.

## Success!

Using the "JOIN ON ANY" technique I can easily write ad-hoc many-to-many join queries.

And using the "view as a join table" technique I can set up many-to-many joins in the Prisma ORM.

## A note on performance

I'm not a DB performance expert, but I doubt these techniques will be as performant as a real many-to-many join table with indexes and foreign keys.

An `EXPLAIN` shows that the first technique (joining on `ANY`) needs to make a sequential scan of the industries table. I know you can add a GIN index on an array column, but I don't know if that would improve anything.

According to `EXPLAIN` the second technique (using a join view) needs to do a sequential scan on the industries table (because it's referenced in the view), but it's able to do an index scan on the industries table.

I'm curious if using a materialized view would improve performance.

Regardless, we haven't encountered issues with slow queries yet, so I'm happy with these solutions.

[prisma-join-convention]: https://www.prisma.io/docs/concepts/components/prisma-schema/relations/many-to-many-relations#conventions-for-relation-tables-in-implicit-m-n-relations
[sequin]: https://sequin.io/
[airtable]: https://airtable.com/
[prisma]: https://www.prisma.io/
