Rather than go in depth on everything a unix-like operating system contains, this is instead a baseline of commands and concepts that you need to navigate your command-line shell in a professional environment.
You should know these concepts:
- Terminals and shells.
- Tab completion.
- Globs.
- Pipes.
- Program success and failure.
- Environment variables.
- Searching shell history.
And when to use these commands:
- cat
- cd
- cp
- echo
- file
- grep
- kill
- less
- ls
- man
- ping
- ps
- rm
- ssh
- sudo
- tail
What follows is a brief, high-level guide to these concepts and commands.
Concepts
The command line
You have a program called a terminal emulator. On macOS it’s called Terminal. It emulates a physical machine from the 1970s and 80s called a terminal. The terminal was not a computer; it was a keyboard and a screen, one of many, connected to a central computer shared by all other terminals. Your terminal app emulates that.
(The terminals of the 80s actually mimicked earlier terminals, which were a keyboard and a printer. Screens were a new invention.)
The Web uses the same model as the terminals did: a request/response programming model (no AJAX for terminals). Each keypress was a request sent to the central computer – typically just the individual letter, but things like backspace or tab or newline were sent as special codes (0x08, 0x09, 0x0a, respectively, in case you’re curious). The computer would push responses that were mostly individual letters but also encoded additional things like ringing a bell (0x07) or ending transmission (0x04). Yes, it was a physical bell.
The terminal was like the Web browser of today, so it follows that you still needed to load a program to interact with. That brings us to the shell.
The shell is a simple program that allows you to run other programs conveniently. As time went on, shells got a little fancier. Many shells now have built-in keywords that look just like programs but actually direct the shell to perform its own action, such as if and while. You’ll be thrilled to know that cd and export are also shell built-ins, not programs. We’ll get into that later.
Tab completion
Your shell will attempt to complete what you’re typing when you press the TAB key. It primarily works on files.
To see the contents of the file browserlist:
cat br<TAB>
You should be able to tab your way through the majority of any file’s name. Between tab completion and globs (next section), you will rarely type a full file name.
Globs
One fancy shell feature is the ability to specify multiple files easily, using globs. Globs can appear anywhere in your command.
Note that globs are handled by the shell, not by the program. In the following examples, the ls program is given a set of file names; it never sees a * character.
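One way to see the shell doing the expanding is to hand the same pattern to echo with and without quotes. This is a minimal sketch that works in a throwaway directory; the file names a.txt and b.txt are made up for the demonstration.

```shell
# Work in a scratch directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch"
touch a.txt b.txt

# Unquoted: the shell replaces the pattern with the matching names
# before echo starts, so echo receives two arguments.
expanded=$(echo *.txt)

# Quoted: the shell leaves the pattern alone, so echo receives the
# literal characters.
literal=$(echo '*.txt')

echo "unquoted: $expanded"
echo "quoted:   $literal"
```

The quoted form is what you want whenever a program should receive the pattern itself, which is common with grep.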
If you want all files within a directory, use the * glob. In this example, list all the show.html.erb templates for all (non-nested) controllers:
ls app/views/*/show.html.erb
If you want all files within any set of subdirectories, ** will traverse everything. Zsh supports this out of the box; Bash 4 and later need shopt -s globstar first. For example, list all base controllers across all namespaces:
ls app/controllers/**/base_controller.rb
To expand into multiple arbitrary strings, use the {} brace expansion. To rename a JavaScript file into TypeScript:
mv main.{js,ts}
That is the equivalent of:
mv main.js main.ts
To make a quick backup copy of a file:
cp test.log{,.bak}
That is the equivalent of:
cp test.log test.log.bak
There is more glob syntax such as ?, [...], and [-], but the above is what you’ll typically need.
Pipes
By default, command-line programs interact with three streams of data: standard input (stdin), standard output (stdout), and standard error (stderr).
Stdin is typically your keyboard, and stdout and stderr are typically your terminal window – but this does not have to be the case.
First, you can connect stdout to a file using >. If you want all lines related to processing a request copied from the log into a file, for example:
grep Processing log/development.log > requests.log
The > syntax will overwrite any contents in the output file. To append, use >> instead.
Likewise, you can connect stdin to a file using <.
To work with stderr, you first need to know that every open file has a number, and stdin, stdout, and stderr are files. Stdin = 0, stdout = 1, and stderr = 2.
The syntax for sending stderr (2) to stdout (1) is 2>&1. Redirections are applied left to right, so point stdout at the file first and then send stderr to the same place. To create a file with a list of every file that matches a glob, and also any permission errors:
ls /Users/*/.ssh/id_rsa > rsa-keys.log 2>&1
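To see why the position of 2>&1 matters, here is a small sketch that runs the same failing command with the redirections in both orders. The name no-such-file is an assumption: it is expected not to exist, so that ls is guaranteed to complain on stderr.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# stdout goes to the file first, then stderr follows it there:
# the error message ends up in both.log.
ls no-such-file > both.log 2>&1 || true

# Here stderr is pointed at the current stdout (captured by the
# command substitution) before stdout moves to the file, so the
# file stays empty and the message escapes.
escaped=$(ls no-such-file 2>&1 > stdout-only.log || true)

echo "both.log has $(wc -c < both.log) bytes"
echo "stdout-only.log has $(wc -c < stdout-only.log) bytes"
echo "escaped message: $escaped"
```

If a log file is mysteriously empty while errors scroll past on screen, the order of the redirections is the first thing to check.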
You can connect the stdout from one program to the stdin of another program using the | character.
For example, to search running programs for Ruby, you can pipe ps into grep:
ps | grep ruby
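Pipes can be chained, with each program reading the previous one’s stdout. In this sketch printf stands in for ps so that the input is predictable; the three process names are invented.

```shell
# printf produces three lines, grep keeps the ones mentioning ruby,
# sort orders them, and head keeps only the first.
first_match=$(printf 'ruby server\nnode worker\nruby console\n' | grep ruby | sort | head -n 1)

echo "$first_match"
```

Each stage only knows about its own stdin and stdout; the shell does the plumbing between them.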
Program success
Every program has an exit status that it returns to the shell. This is a number from 0 to 255.
Zero means success. Non-zero means failure.
This is typically not visible to you, but it is available in the special $? variable:
echo $?
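As a rough illustration, grep is a convenient program to poke at because it exits 0 when it finds a match and 1 when it does not. The Gemfile below is a one-line stand-in written to a scratch directory, not a real project file.

```shell
scratch=$(mktemp -d)
printf 'gem "rails"\n' > "$scratch/Gemfile"

# grep finds a match, so its exit status is 0.
grep -q rails "$scratch/Gemfile"
found=$?

# grep finds nothing, so its exit status is 1. The "|| missing=$?"
# form records the status without tripping scripts run with "set -e".
grep -q sinatra "$scratch/Gemfile" || missing=$?

echo "found=$found missing=$missing"
```

Note that $? is overwritten by every command, so copy it into a variable straight away if you need it later.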
Shells include some syntax for combining programs based on success and failure. The syntaxes to look at are && and ||.
To delete a file so long as another file contains a string:
grep shhhh-secret Gemfile && rm Gemfile.lock
To re-run a command with privileges if needed:
rm /etc/passwd || sudo rm /etc/passwd
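A harmless way to watch the short-circuiting is with the true and false programs, which do nothing except exit with 0 and 1 respectively. This is only a sketch; the variable names are made up.

```shell
# && runs the right-hand side only if the left-hand side succeeded.
true && and_result="ran"

# || runs the right-hand side only if the left-hand side failed.
false || or_result="ran"

# Neither right-hand side runs here, so the defaults survive.
and_skipped="skipped"
false && and_skipped="ran"
or_skipped="skipped"
true || or_skipped="ran"

echo "$and_result $or_result $and_skipped $or_skipped"
```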
Environment
Your operating system maintains a global dictionary called the environment. You can set and access environment variables to manipulate this.
(If you’re thinking “doesn’t a global dictionary interact poorly with threads?”, let me tell you: it also interacts poorly with multiple processes. It’s been a disaster since 1976.)
Your shell has its own private dictionary which you can then export into the global dictionary. When you run a command from your shell, it does not gain access to the shell’s private environment dictionary. You must export it first.
For example, you can set DISABLE_SPRING from the shell:
DISABLE_SPRING=1
But Rails won’t see that until you export it:
export DISABLE_SPRING
You can combine this into one line:
export DISABLE_SPRING=1
Note that DISABLE_SPRING is the name of the variable but $DISABLE_SPRING is the value:
echo $DISABLE_SPRING
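The difference between set and exported can be checked by starting a child process and asking what it sees. In this sketch sh -c plays the part of Rails, which is an assumption made to keep the example self-contained.

```shell
# Start from a clean slate in case the variable is already set.
unset DISABLE_SPRING

# Set but not exported: a child process cannot see it.
DISABLE_SPRING=1
before=$(sh -c 'echo "${DISABLE_SPRING:-unset}"')

# Exported: child processes started from now on inherit it.
export DISABLE_SPRING
after=$(sh -c 'echo "${DISABLE_SPRING:-unset}"')

echo "before export: $before"
echo "after export:  $after"
```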
Different programs use environment variables in totally different ways, and you need to learn about them from their respective manuals. You set and export them from the shell, but they are used by other programs.
One important environment variable is PATH. This is a colon-separated list of directories that your shell will look in to find programs.
If your PATH is:
PATH=/usr/bin:/usr/local/bin:bin
Then when you run webpack, your shell will look for /usr/bin/webpack, then /usr/local/bin/webpack, and finally bin/webpack.
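To inspect the lookup on your own machine, you can print the PATH one directory per line and ask the shell where it finds a given program. The sketch uses ls because it is assumed to be installed everywhere; the directories printed will differ from machine to machine.

```shell
# Print each PATH entry on its own line, in search order.
echo "$PATH" | tr ':' '\n'

# Ask the shell which file it would run for ls.
ls_location=$(command -v ls)
echo "ls resolves to: $ls_location"
```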
History
You can press the up arrow to recall the prior command, and enter to re-run it. Excellent.
Better yet, you can enter an interactive search for the command you want by pressing control-r. Start typing, and if you get the command you want then hit enter. Press control-r again to keep searching further back in history.
Press control-c to give up and exit the search.
To see the history, use the history built-in.
Commands
These are the commands that you will need in order to interact with files and your operating system:
cat
The cat program lists the contents of files. You can give multiple file names and it will concatenate them.
cat Procfile Rakefile
cd
Each shell instance has its own idea of the present working directory. When you use tab completion, your shell works off its idea of which directory you are “in”.
You can change that using the cd (change directory) built-in. It changes only the shell’s own working directory, which is why it has to be a shell built-in rather than a separate program.
cd hub
The cd built-in accepts a single - as its argument, which causes it to change to the prior directory.
cd -
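Because each shell instance tracks its own working directory, a cd inside a child shell leaves the parent where it was. A small sketch, using command substitution to start the child shell and / as an arbitrary destination:

```shell
start=$(pwd)

# $( ... ) runs its commands in a child shell. The cd inside
# changes only that child's working directory.
inside=$(cd / && pwd)

# Back in the original shell, nothing has moved.
after=$(pwd)

echo "inside the child shell: $inside"
echo "after the child shell:  $after"
```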
cp
The cp program copies a file. To copy production.rb to staging.rb:
cp config/environments/production.rb config/environments/staging.rb
Or, using brace expansion:
cp config/environments/{production,staging}.rb
If you pass more than two arguments, the final argument must be a directory. It will copy all the preceding arguments to that directory. To copy all Markdown files into the doc directory:
cp *.md doc
echo
The incredibly simple echo program prints its arguments to stdout. It’s a fun exercise to write this program yourself.
It is mainly used in scripts to show messages to the user:
echo "Loading production data, please wait ..."
Note that you can send the message to stderr using redirects:
echo "Failed to restore database." >&2
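To check that a message really went to stderr, you can capture stdout and see that it comes back empty. The warn function below is a hypothetical helper, not something the shell provides.

```shell
# A tiny helper that writes its arguments to stderr.
warn() {
  echo "$*" >&2
}

# Command substitution captures stdout only, so the message goes
# straight to the terminal and nothing is captured.
captured=$(warn "Failed to restore database.")

# Redirecting stderr into stdout makes the same message capturable.
redirected=$(warn "Failed to restore database." 2>&1)

echo "captured: [$captured]"
echo "redirected: [$redirected]"
```

Keeping errors on stderr means they still reach the user when the script’s stdout is piped into another program.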
file
The file program tells you what kind of file something is. Ever download a JPEG that turns out to be a WebP? The file program will tell you that.
file hatchet.jpg
grep
The grep program searches files for a regular expression. With the advent of Ripgrep, the most common thing to grep is another program’s stdout, arriving on grep’s stdin through a pipe.
ps | grep ruby
The pattern can be a complex regex, but note that you are fighting shell quoting the whole time. Use single quotes if you’re at all cautious:
ps | grep '[cC]hrome'
The grep program can display context around a match with the -C flag:
grep -C2 dependencies yarn.lock
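Here is a sketch of what the context flag does on a file small enough to follow by eye. The file numbers.txt and its contents are invented for the example.

```shell
scratch=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$scratch/numbers.txt"

# -C1 prints one line of context on each side of the match,
# so the result is the lines two, three, and four.
context=$(grep -C1 three "$scratch/numbers.txt")

echo "$context"
```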
kill
The kill program sends a signal to another process.
A unix OS has a basic inter-process communication system where you can send any of two dozen one-byte messages (signals) to a program. Some notable signals are HUP (1), INT (2), KILL (9), TERM (15), STOP (19), and CONT (18). The numbers for STOP and CONT are the Linux values and differ on macOS, so prefer the names over the numbers.
The TERM, INT, and KILL signals all tell the process to stop. The INT (interrupt) signal is the most polite, then TERM (terminate), then KILL (immediately stop). Programs can trap the INT and TERM signals to perform cleanup, but cannot trap the KILL signal.
The STOP signal will pause the program, and the CONT signal will unpause.
The HUP signal traditionally indicates that the long-running program should re-read its configuration file.
To work with the kill program, you need a process ID. You can get that using ps or pgrep. Once you have that, you can send a signal to the process.
To tell Postgres, at process ID 42069, to reload the server configuration files:
kill -HUP 42069
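If you would like to practice on something less important than Postgres, a background sleep makes a safe target. This is a minimal sketch; the 128-plus-signal-number exit status is the convention in Bash, Dash, and Zsh, and other shells may report it differently.

```shell
# Start a long-running process in the background; $! holds its PID.
sleep 60 &
pid=$!

# Ask it politely to terminate.
kill -TERM "$pid"

# wait reports how the process ended. A process ended by a signal
# gets a status of 128 plus the signal number (15 for TERM).
status=0
wait "$pid" || status=$?

echo "sleep exited with status $status"
```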
The pkill program works just like kill but takes a program name instead of a process ID. This is a more dangerous game unless you specifically know how many matching processes there are and how to best write a pattern for their name. Typically by the time you finish your investigation, you have a process ID in hand.
To tell Postgres to cancel a running query:
pkill -INT postgres
less
The less command paginates a file. Its name is a joke: there was originally a program called more, and some people wanted to improve it but refused to collaborate with the more developer. They called their improvement less. It was better, and now it’s all we use.
(There is now a replacement for less called most.)
Anyway, you can either use it in a pipe:
grep http Gemfile.lock | less
Or use it on a file directly:
less package.json
By default, less will show you the bytes in a file. This is typically fine; the file is a bunch of ASCII.
However, sometimes a file or stdout contains instructions for a terminal. For example, color codes are a feature from the terminals in the 80s that work by sending ESC (0x1b), a [ character, a number, and then m; the number determines the color to show. But if less is just showing you bytes, you’re going to see that ESC instead of the color.
The solution is to pass the -R flag. This will cause less to show the colors as colors:
less -R log/development.log
ls
The ls program lists files. It comes in two flavors.
Basic:
ls bin
And long:
ls -l config
By default it will list files in the current working directory:
ls -l
It skips any file whose name begins with a dot (.). To see those, too, pass the -A flag:
ls -A
man
Unix programs ship with a manual installed on the computer. Back in the day this was called the online help. Now it’s called the offline docs. Words!
You can access the manual with the man program.
man ls
The manuals are categorized into sections. Section 1 is for normal programs. Section 8 is administration programs that can break your computer if you’re not careful. Section 7 contains overviews, tutorials, and concept explanations.
You’ll often see programs written as rm(1) or useradd(8). The number is the section.
On GNU/Linux we have a glob(3) and a glob(7). Section 3 is C functions. To access the manual glob(7):
man 7 glob
For more details, see man(1)!
ping
The ping command is a quick way to know whether your Internet connection is slow.
ping thoughtbot.com
Press control-c to end the pings by sending an INT signal.
ps
The ps program lists processes.
Every program is at least one process. Some programs, like Google Chrome, spawn multiple processes.
By default ps only shows processes that are in your current terminal session. That’s usually two: your shell, and ps itself.
I usually run ps ax. You can also give ps -e a try. Either way, you’ll want to pipe it to either grep or less.
The ps program can also tell you memory usage. To use this, first find the process ID you want using ps | grep, and then use the -o rss flag to see the Resident Set Size (RSS) in kilobytes for that process ID.
ps -o rss 12345
rm
The rm program removes files and directories. By default it will not remove a directory; pass the -r flag to recurse into it and remove everything inside.
rm foo
rm -r tmp/*
Unix hot tip: almost every program across all flavors of unix will stop interpreting command line flags once they encounter --. Anything after -- is treated as raw text, and nothing more.
So if you accidentally create a file named -rf, you can remove it like so:
rm -- -rf
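You can rehearse this safely in a scratch directory. The sketch below creates the awkward file on purpose and then removes it.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Create a file whose name looks like a flag. The ./ prefix is
# another way to stop a name from being read as an option.
touch ./-rf

# Everything after -- is an operand, so rm deletes the file named
# -rf instead of treating it as the -r and -f flags.
rm -- -rf

remaining=$(ls -A)
echo "files left: [$remaining]"
```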
ssh
The ssh (Secure SHell) program allows you to connect to another computer over an encrypted connection.
To connect as ralph to thoughtbot.com:
ssh ralph@thoughtbot.com
The authentication is done via a public/private keypair. You can run the ssh-keygen command to generate your keypair. You only need to run that once.
This is a set of files named something like .ssh/id_rsa and .ssh/id_rsa.pub. The .pub file is your public key. Do not share the other file!
Under the hood, this is how git connects to GitHub. That’s why the clone command looks like:
git clone git@github.com:...
This is why GitHub wants your SSH public key.
SSH allows you to trust computers using the Trust On First Use (TOFU) principle: the first time you connect, it shows you the other computer’s fingerprint and asks whether you trust it. If you say yes, then it won’t ask again.
You can usually find the other computer’s fingerprint in the support documentation for the service you’re connecting to, such as Heroku’s or GitHub’s support docs.
sudo
Unix popularized the idea of one computer having multiple users. This is how the terminals worked, after all. The sudo program allows you to switch between these users for the duration of one command.
By default, it switches to the administrator user (root).
sudo vi /etc/passwd
Not baseline knowledge, but if you find yourself redirecting to files while using sudo, look into tee.
tail
The tail program shows the last 10 lines of a file.
The -f flag will show the last 10 lines and then continue to show lines as they come in. Handy for logs:
tail -f log/test.log
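A quick way to convince yourself of the 10-line default is to build a file with a known number of lines. This sketch writes twelve lines to a stand-in test.log in a scratch directory; it also uses the -n flag, which is not covered above, to ask for a different number of lines.

```shell
scratch=$(mktemp -d)

# Write twelve numbered lines to a stand-in log file.
i=1
while [ "$i" -le 12 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$scratch/test.log"

# With no flags, tail prints the last 10 lines: 3 through 12.
last_ten=$(tail "$scratch/test.log")

# -n changes how many lines are shown.
last_two=$(tail -n 2 "$scratch/test.log")

echo "$last_ten" | head -n 1
echo "$last_two"
```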