Understanding SQL SELECT and WHERE

As web developers, we deal with SQL often. Perhaps it’s mostly abstracted away from us using an ORM like ActiveRecord. But sometimes we find ourselves trying to get more complex data and reach for SQL. We hit an error and scratch our heads. Without a proper mental model of SQL, it’s very difficult to figure out what to do. Its syntax doesn’t share as much with languages like Ruby or JavaScript as we might like. In this three part series, I hope to give you strong mental models for three aspects of SQL: WHERE and SELECT, JOIN, and GROUP BY.

Test Data

PostgreSQL provides some sample datasets that we can use. For this post, I’m using the world dataset. If you have a postgres server running, here’s how you can download, unpack, create, and import this dataset:

$ curl -O https://ftp.postgresql.org/pub/projects/pgFoundry/dbsamples/world/world-1.0/world-1.0.tar.gz
$ tar -zxvf world-1.0.tar.gz
$ createdb world
$ psql world < dbsamples-0.1/world/world.sql

Then, we can connect to our new database:

psql world

The dataset has three tables, city, country, and countrylanguage. city and country are lists of places around the world and countrylanguage lists the languages spoken in a given country. You can take a look around at what kind of data we have using \d or \d [table name].

WHERE is for Rows

When querying, we can use the WHERE clause to filter the rows we get back. To use WHERE we specify a condition like “population less than 1000”:

FROM city
WHERE population < 1000;

I visualize the result of this query by highlighting every row where the value is less than 1000:

Visualize the SQL where clause A table showing 5 cities, 3 of which are highlighted because their population is less than 1000

SELECT is for Columns

Conversely, if we want to get back only certain pieces of data (i.e. columns) of each row, we use SELECT:

SELECT name, countrycode
FROM city;

Similarly, I like to visualize this by highlighting just the columns of the city table:

Visualize the SQL select clause A table showing the name and country code for 5 cities. The ID column is not selected.

Both at the Same Time

And the real power of SQL is that we can combine both together:

SELECT name, countrycode
FROM city
WHERE population < 1000;

Visualize SQL select and where clauses A table showing data for 5 cities. The name and country code columns are highlighted, as are the rows for cities with a population less than 1000.

Interestingly, this selects two columns (name, countrycode), and uses a third column (population) to filter the data. This works because WHERE is run before SELECT.

Parting Thoughts

Maybe you already knew this but needed a bit of practice. Play around with the results a bit and try some other queries. You can do multiple WHERE queries by using AND like: WHERE population < 1000 AND countrycode = 'AIA'

In part 2, we’ll talk about combining two tables together with JOIN, and part 3 will cover GROUP BY.