Crafting History With Rebase


Notes

One of the most powerful features of Git is how it separates the acts of coding and version control. There's no central server to sync with and no locking files so your teammates can't edit them while you do. In fact, there is almost nothing required before you start coding.

Instead, you can simply go to work, coding and figuring things out as comes naturally. Whenever you're ready, you can then focus on version control and commit your changes.

While this kind of freedom is invaluable, it doesn't always lead to the most straightforward history. Luckily for us, Git provides many ways to craft our history, arranging it to tell the story of building a feature as we'd want, rather than being stuck with the wandering sequence of changes we initially created.

In this video we'll cover these techniques, including selective staging, cherry picking, and all forms of rebase.

Add Patch

Often when we're ready to commit our changes, we'll run git add . or git add --all, but these are a bit coarse. We can use the more focused form where we name files or directories to stage, for instance git add Gemfile, but even this can be too coarse if we have distinct change sets within a single file.

Instead, we can use the --patch flag to tell Git that we would like to review each group of changed lines individually, and choose whether or not to stage them. The full command is run as:

$ git add --patch

When running git add --patch, Git will split the changes in each file into "hunks" and present them one at a time, prompting you for how to proceed. The primary operations are to stage the hunk, skip it, or split it into smaller hunks if possible. Below is a collection of some of the more common operations (this list can also be seen by typing h at the prompt):

Key	Operation
`h`	Display the list of available keys and their operation
`y`	Stage the current hunk
`n`	Skip this hunk
`s`	Split the hunk
`a`	Stage this and all remaining hunks
`q`	Quit, skipping all remaining hunks
`e`	Edit the hunk manually, allowing for line-by-line staging

Cherry Pick

In some cases we'll find that we've made a commit on the wrong branch. We might have checked out master and made changes there, rather than our feature branch. Luckily Git provides a command, cherry-pick, that allows us to copy commits onto a different branch.

Assume we have two commits made to our master branch and we want to move them onto our sanitize-search-query feature branch. We can do this by first identifying the commits we want to bring across. We could specify them by their commit hashes, but an easier option is to reference them as the range from origin/master up to master. We can confirm that we've specified the range properly by running either git diff or git log with the range:

$ git diff origin/master..master
# confirm that the diff contains all the expected changes

or 

$ git log origin/master..master
# confirm that only the expected commits are listed

From there, we can check out our feature branch and run the cherry-pick command, passing the range as our argument:

$ git checkout sanitize-search-query

$ git cherry-pick origin/master..master
# git will create a new commit for each in the range and output a summary of
# the new commits

Reset Hard

Note, with the above cherry-pick operation, as with all Git operations, Git does not destroy or edit commits, but instead simply creates new ones. This means that the two commits as authored are still on our master branch.

We can solve this by resetting our master branch to align it with origin/master, essentially erasing the original commits (although Git still remembers them in its reflog, just in case; Git's got our back).

$ git checkout master

$ git reset --hard origin/master
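
As a rough illustration of "Git still remembers", the sketch below resets master in a scratch repository and then checks that the discarded commit is still present. The repository, file names, and messages are hypothetical, and the origin/master ref is created by hand with update-ref as a stand-in for a real remote.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git symbolic-ref HEAD refs/heads/master

echo "base" > README
git add README
git commit -q -m "Initial commit"
# Stand-in for a fetched remote-tracking branch (assumption for the demo).
git update-ref refs/remotes/origin/master HEAD

# A commit that ended up on master by mistake.
echo "oops" > stray.txt
git add stray.txt
git commit -q -m "Commit made on the wrong branch"
stray=$(git rev-parse HEAD)

# Realign master with origin/master.
git reset -q --hard origin/master

# The commit is gone from master but still stored, and listed in the reflog.
object_type=$(git cat-file -t "$stray")
in_reflog=$(git log -g --format=%H master | grep -c "$stray")
```

If the reset turns out to be a mistake, the hash recorded in the reflog can be used to bring the commit back, for example with git cherry-pick or another git reset --hard.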

Rebase

Now that we've seen cherry-pick, we can tackle the oft-dreaded rebase. It turns out that a standard rebase is essentially identical to the cherry-pick workflow that we've just shown.

When we rebase, we do it from the context of our feature branch, and we specify the target branch we want to rebase onto. So the command name "rebase" actually does a great job of explaining what the command does; it "re bases" or bases again. We want to take the work we've done on our feature branch, and reapply it as if it were done on top of the additional commits in our target branch.

As an example, let's assume we have a branch that when started was based off of master. We've made some changes on our branch and now have two new commits, but at the same time our colleagues have also made changes on the master branch.

$ git branch
* faq-updates
  master

$ git rebase master

When performing the rebase, Git finds the commits unique to our branch and computes the diff of the changes each one introduced. It then moves to the target branch, master in this case, and applies the diffs one by one, creating new commits that reuse the commit messages from our branch. Once done, it updates our branch to point at the newest of these commits created by reapplying the diffs.
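
The effect is easy to observe in a scratch repository. In this sketch the file names and commit messages are hypothetical; the point is only to compare the branch before and after the rebase.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git symbolic-ref HEAD refs/heads/master

echo "base" > README
git add README
git commit -q -m "Initial commit"

# Two commits on our feature branch.
git checkout -q -b faq-updates
echo "Q1" > faq.txt
git add faq.txt
git commit -q -m "Add first FAQ entry"
echo "Q2" >> faq.txt
git add faq.txt
git commit -q -m "Add second FAQ entry"
before=$(git rev-parse HEAD)

# Meanwhile, a colleague's commit lands on master.
git checkout -q master
echo "colleague" > other.txt
git add other.txt
git commit -q -m "Colleague's change"

# Reapply our two commits on top of the updated master.
git checkout -q faq-updates
git rebase -q master
after=$(git rev-parse HEAD)

subjects=$(git log --format=%s master..faq-updates)
```

The branch tip hash changes because the commits were recreated, yet the messages are preserved and the branch now starts from the latest commit on master.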

Interactive Rebase

And now we're ready to move on to our final form of history crafting which is the surprisingly-powerful interactive rebase command. Unfortunately it shares a name and command with the standard rebase we just reviewed, but for practical purposes it's actually quite different.

With interactive rebase we're not moving our commits to a different point in history, but instead revising the given range of commits in place. The most common usage is to combine or squash commits together, but you can even go so far as to entirely delete or reorder commits.

Some folks will say that revising history is an unforgivable sin, but we here at thoughtbot consider it an important part of our workflow. While we would never revise published history, specifically the master branch, we almost always revise our commits on feature branches before merging them in. We value a clean history, and the majority of the time, the commits in a feature branch contain many rounds of refactoring and PR reviews which we don't want in the permanent history. Instead, we want the most direct and concise form of the history that fully captures the change we settled on in our feature branch after completing any refactoring or updates.

To demonstrate, we'll again come back to the faq-updates branch, where the history is currently as follows:

$ git log --oneline --decorate --graph -5
* 009ac91 (HEAD -> faq-updates) Add other file
* c0a3941 Remove line in README
* be891be WIP add forum answer to FAQ
* a915fac (origin/master, origin/HEAD, master) Add a search page using...
* 85ff650 Update skylight gem

From this view, we can see that we have 3 commits on our faq-updates branch, and they are directly ahead of master. Using interactive rebase, we can squash these down into a single commit:

$ git rebase -i master

Git will then open our editor with a file for us to edit, similar to how commit messages are composed. In this file the commits we can operate on are presented one per line, oldest first (the reverse of the order git log shows). We can then choose from an array of operations, but the most common is simply to "squash" the commits below the first commit.

pick be891be WIP add forum answer to FAQ
s c0a3941 Remove line in README
s 009ac91 Add other file

Squashing keeps the final state of the files but folds each squashed commit into the one before it, so we are left with a single commit whose contents match the working tree of the last of the "squash" commits.
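
For experimenting without an editor, the squash above can be scripted. This sketch assumes a scratch repository with hypothetical file contents, uses the GIT_SEQUENCE_EDITOR environment variable to rewrite the todo list with sed, and sets GIT_EDITOR=true to accept the combined commit message unchanged. In normal use you would edit both by hand.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git symbolic-ref HEAD refs/heads/master

printf 'line one\nline two\n' > README
git add README
git commit -q -m "Add a search page"

# Three commits on the feature branch, mirroring the log shown earlier.
git checkout -q -b faq-updates
echo "forum answer" > faq.txt
git add faq.txt
git commit -q -m "WIP add forum answer to FAQ"
echo "line one" > README
git commit -q -am "Remove line in README"
echo "other" > other.txt
git add other.txt
git commit -q -m "Add other file"
tree_before=$(git rev-parse 'HEAD^{tree}')

# Change "pick" to "squash" on every todo line after the first, then rebase.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -q -i master > /dev/null

commit_count=$(git rev-list --count master..faq-updates)
tree_after=$(git rev-parse 'HEAD^{tree}')
```

The branch ends up with one commit ahead of master, and its tree is identical to the tree of the last commit before squashing, which is the behavior described above.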
