Welcome back to our tour of the Git object model. In this video, we'll go beyond the base objects and look at more of the structure with tags, branches, and remotes, as well as reviewing how the various Git commands act on this collection of objects.
Note - If you haven't watched the First Part of this review of the Git object model, we highly recommend you go back and do that now, as this video largely builds on that foundation.
Before adding more to our growing picture of the Git object model, let's quickly review the base objects we covered in the first video: blobs, trees, and commits.
Returning to our peek into the .git directory, we can first review the layout:
$ tree .git -L 1
.git
├── COMMIT_EDITMSG
├── HEAD
├── config
├── description
├── hooks/
├── index
├── info/
├── logs/
├── objects/
└── refs/
In the previous video, we focused primarily on the objects directory, which acted as a database of the blob, tree, and commit objects we created as we worked with our repo.
In this video, we'll instead focus on the refs directory. Peeking inside, we see:
$ tree .git/refs
.git/refs
├── heads/
│   └── master
└── tags/
The first directory we'll encounter within the refs directory is heads. These are our local branches. The directory is called heads, as our local branches are the collection of things that HEAD can point at. HEAD is the ultimate ref, defining what we currently have checked out.
Right now the heads directory only contains a single file, master. We call the master file a "file", rather than some more complex Git object, because that is what it is. We can test this by printing it with cat:
$ cat .git/refs/heads/master
f95b2fe3b64c6351e7eec4011921b4469098b9ba
Here we can see that the file contains a string which looks very much like a Git object hash. We can then turn around and ask Git about the object:
$ git cat-file -t f95b2fe3b64c6351e7eec4011921b4469098b9ba
commit
$ git cat-file -p f95b2fe3b64c6351e7eec4011921b4469098b9ba
tree 0cae7dc167b255c0123c7c396fc48ce40fc35cfa
parent ef34a153025fffb8a498fff540f7c93963937291
author Chris Toomey <email@example.com> 1441311544 -0400
committer Chris Toomey <firstname.lastname@example.org> 1441311544 -0400

Another file in app dir
Now we have a full picture of what exactly our master branch is: a file, stored in .git/refs/heads. Its contents are the hash of a single commit. We know that commits contain a pointer to a tree (the snapshot of our working directory), as well as to their parent commits, and now we can add branches to the list of pointers in our view of the Git world.
Branches are just pointers; nothing more!
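To make this concrete, here is a minimal sketch that builds a throwaway repository and compares the branch file with what Git itself reports. The temporary directory, file name, and identity values are placeholders chosen for illustration, and the sketch assumes Git's default loose-ref storage (a repository that has packed its refs keeps the hash in .git/packed-refs instead).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Pin the branch name so the sketch does not depend on init.defaultBranch.
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

# The branch is a plain file whose contents are a commit hash.
branch_file=$(cat .git/refs/heads/master)
resolved=$(git rev-parse master)
echo "branch file: $branch_file"
echo "rev-parse:   $resolved"
echo "object type: $(git cat-file -t "$branch_file")"
```

Both hashes should match, and the object type should be reported as a commit.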
Now that we have an understanding of branches, we can shift our focus to tags. We'll create a tag by running git tag v0.1, and then we can take another look at our .git/refs directory to see what we have:
$ tree .git/refs
.git/refs
├── heads/
│   └── master
└── tags/
    └── v0.1
Now we have a new file, v0.1, stored in the tags directory. Similar to our master head file, we can cat out the tag file directly to see what it contains:
$ cat .git/refs/tags/v0.1
f95b2fe3b64c6351e7eec4011921b4469098b9ba
Just like our master file, the v0.1 file contains nothing more than the hash of a commit. It is possible for tags, unlike branches, to grow a bit more complex by adding things like annotations, PGP signatures, and other metadata. In this case, the tag is stored as a tag object in the .git/objects directory, and the tag file will simply contain the hash of that tag object (which will contain the hash of the commit that was tagged). This just adds one additional step, so we can still think of tags as simple pointers to commits.
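The difference between the two kinds of tag can be sketched as follows. The tag name v0.2 and its message are made up for this illustration, and the sketch again assumes loose refs in a throwaway repository.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

git tag v0.1                            # lightweight tag: just a ref file
git tag -a v0.2 -m "Annotated release"  # annotated tag: ref file plus a tag object

light_type=$(git cat-file -t "$(cat .git/refs/tags/v0.1)")
annotated_type=$(git cat-file -t "$(cat .git/refs/tags/v0.2)")
echo "v0.1 file points at a: $light_type"
echo "v0.2 file points at a: $annotated_type"

# Peeling the annotated tag takes the one extra step to reach the commit.
peeled=$(git rev-parse 'v0.2^{commit}')
echo "v0.2 peels to: $peeled"
```

The lightweight tag file holds a commit hash directly, while the annotated tag file holds the hash of a tag object that in turn names the same commit.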
While branches and tags are very similar in that they both simply contain a reference to a commit, they differ in that branches are expected to move to new commits as we work, while tags are meant to stay fixed on the commit they were created for.
Tags exist to lock down and name ("tag", if you will) a specific version of the code. Branches exist to track the changes in our codebase over time, and will therefore update whenever we commit or merge.
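A small sketch of that difference, using a throwaway repository with placeholder file and identity values: after a second commit, the branch has moved while the tag still names the original commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

git tag v0.1
tagged=$(git rev-parse v0.1)

# A new commit advances the checked out branch but leaves the tag alone.
echo "more" >> file.txt
git commit -q -a -m "Second commit"

branch_now=$(git rev-parse master)
tag_now=$(git rev-parse v0.1)
echo "tag before: $tagged"
echo "tag after:  $tag_now"
echo "branch now: $branch_now"
```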
For the small local sample repo we've been working with so far there are no remotes, but we can hop over to the local checkout of the Upcase repo to see an example that contains remotes:
$ tree .git/refs
.git/refs
├── heads/
│   ├── deck-last-attempt
│   ├── master
│   ├── ... (truncated)
│   └── welcome-trail
├── remotes/
│   ├── origin
│   │   ├── HEAD
│   │   ├── cjt-north-star-metric
│   │   ├── master
│   │   ├── mg-button-colors
│   │   └── ... (truncated)
│   ├── production
│   │   └── master
│   └── staging
│       ├── dashboard-staging
│       ├── ... (truncated)
│       └── master
└── tags/
    └── v0.1
With the more real-world example of the Upcase Git repo, we can see that there is now a third subdirectory alongside heads and tags in the refs directory: remotes. Within the remotes directory, there is a directory for each of our remotes: origin, staging, and production. This adds a bit more structure, but otherwise these objects are the same as our branches. We can confirm this by investigating the contents of one of these remote branch files:
$ cat .git/refs/remotes/origin/cjt-north-star-endpoint
3891a7bc21e5e0c69e71e8153bb8b4a67b80bff5
$ git cat-file -t 3891a7bc21e5e0c69e71e8153bb8b4a67b80bff5
commit
$ git cat-file -p 3891a7bc21e5e0c69e71e8153bb8b4a67b80bff5
tree 32022b6465ebf9f9e37b7e1caccb3c9e620dd465
parent 7262141ae317f56b567ed2f95505e6ca9bbe1605
author Chris Toomey <email@example.com> 1433384047 -0400
committer Chris Toomey <firstname.lastname@example.org> 1435239388 -0400

WIP analytics JSON endpoint
Again, we see more of the same. Remote branches are simply pointers to a commit. It's pointers all the way down, friends!
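The same check can be reproduced without access to a real remote. This sketch fakes one with a second local repository (the directory names upstream and local are invented for the example) and reads the remote-tracking ref through Git rather than with cat, since Git may store these refs in .git/packed-refs rather than as individual files.

```shell
set -e
work=$(mktemp -d)

# A stand-in "remote" repository with a single commit.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"
upstream_commit=$(git rev-parse HEAD)

# A second repository that registers the first one as origin and fetches it.
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/upstream"
git fetch -q origin

git for-each-ref refs/remotes
remote_ref=$(git rev-parse refs/remotes/origin/master)
echo "origin/master points at a $(git cat-file -t "$remote_ref")"
```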
HEAD is the final object we need to be aware of to understand Git. HEAD, unlike the other objects we've discussed, is a singleton, meaning that there is only ever one HEAD. HEAD identifies the currently checked out object. Typically, this is a branch (with that branch pointing to a commit), but it is possible to check out a commit directly, in which case HEAD would be pointing at that commit.
HEAD is a file just like our branch objects. It lives at the root of the .git/ directory and its contents are similarly simple:
$ cat .git/HEAD
ref: refs/heads/master
This is the normal mode for Git, where HEAD points to a branch, in this case the master branch. If we were to check out a commit directly, then HEAD would simply point at that commit (a state Git calls a "detached HEAD"):
$ git checkout 833c1ea
$ cat .git/HEAD
833c1ea55d76adcf48b5f7e933271fcc3e36f123
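Both states of HEAD can be observed in a scratch repository. This is only a sketch: the repository, file, and identity values are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"
commit=$(git rev-parse HEAD)

# Normal mode: HEAD names a branch.
head_attached=$(cat .git/HEAD)
echo "on a branch: $head_attached"

# Checking out the commit directly makes HEAD hold the hash itself.
git checkout -q "$commit"
head_detached=$(cat .git/HEAD)
echo "detached:    $head_detached"
```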
So once again we find ourselves with a pointer. HEAD points to a branch, that branch points to a commit, and that commit points to a tree and a parent commit. Pointers. All. The. Way. Down.
And, with the addition of HEAD, we have a complete picture of the Git object model.
Now that we understand the objects that are used throughout Git, we're going to zoom out a bit and focus primarily on commits and refs. Nearly all operations in Git involve commits, although typically these commits are referenced through refs like branches and remotes.
Checking out a new branch is just the act of creating a ref file, specifically a "head", and populating it with the relevant commit hash.
$ git checkout -b new-branch
First, Git will follow from HEAD to the current branch to determine what commit hash that branch points at. With that info, Git creates a new file in .git/refs/heads with our new branch name as the file name, and the commit hash as the contents. Lastly, it updates HEAD to point at this new ref.
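The steps just described can be sketched twice over: once with checkout -b, and once by hand with the plumbing commands git update-ref and git symbolic-ref. The branch name manual-branch is invented for the comparison, and treating the plumbing pair as "the same thing" is a simplification that ignores the working directory update checkout would also handle.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"
commit=$(git rev-parse HEAD)

# Porcelain: create the ref file and repoint HEAD in one command.
git checkout -q -b new-branch
new_branch_file=$(cat .git/refs/heads/new-branch)
head_after_checkout=$(cat .git/HEAD)

# Plumbing: write the ref, then repoint HEAD, as two explicit steps.
git update-ref refs/heads/manual-branch "$commit"
git symbolic-ref HEAD refs/heads/manual-branch
manual_branch_file=$(cat .git/refs/heads/manual-branch)
head_after_plumbing=$(cat .git/HEAD)

echo "new-branch:    $new_branch_file ($head_after_checkout)"
echo "manual-branch: $manual_branch_file ($head_after_plumbing)"
```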
Similarly, we can use the verbose form of checkout, where we explicitly specify the base branch. For instance:
$ git checkout -b other-branch master
This is largely the same as the last checkout, but instead of starting from HEAD, we start from the specified branch to determine the commit to point at, and use that to populate our new ref file.
There's an alternative form of checkout when we check out a file by specifying a ref. Technically, we need a tree to get to a specific version of a file, but Git's pointer system also allows for something to be "tree-ish". When something is tree-ish, it will eventually lead to a single tree by dereferencing the pointers.
A commit is tree-ish because commits point at a single tree for the working directory.
Refs are tree-ish because they point at commits, which point at a tree.
Even HEAD is tree-ish by the same logic.
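One way to watch this dereferencing happen is the ^{tree} suffix understood by git rev-parse, which asks Git to follow the pointers until it reaches a tree. In this sketch (scratch repository, placeholder names) a commit hash, a branch, and HEAD all resolve to the same tree.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"
commit=$(git rev-parse HEAD)

tree_from_commit=$(git rev-parse "$commit^{tree}")
tree_from_branch=$(git rev-parse 'master^{tree}')
tree_from_head=$(git rev-parse 'HEAD^{tree}')

echo "commit -> $tree_from_commit"
echo "branch -> $tree_from_branch"
echo "HEAD   -> $tree_from_head"
```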
So if we check out a single file from master (a command of the form git checkout master -- <path>), Git will begin by looking up the commit that master points at, then the tree of that commit, and then walk down through the intermediate trees until it reaches the blob for the requested file, and restore that version of the file.
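Here is a sketch of that lookup in a scratch repository. The file name notes.txt and the branch name feature are hypothetical; the point is only that the restored file matches the blob reachable from master.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "version from master" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Change the file on another branch so there is something to restore.
git checkout -q -b feature
echo "version from feature" > notes.txt
git commit -q -a -m "Change notes on feature"

# Restore master's version of the one file while staying on feature.
git checkout master -- notes.txt
restored=$(cat notes.txt)

# master:notes.txt walks ref -> commit -> tree -> blob.
blob_in_master=$(git rev-parse master:notes.txt)
blob_on_disk=$(git hash-object notes.txt)
echo "restored contents: $restored"
```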
Committing takes all of the staged objects and stores them as needed. This typically involves at least one new blob, and a new tree for the current version of the working directory.
It then builds a commit object that points at our new tree, as well as the commit we are currently on.
Lastly, it updates our checked out branch to point at this newly created commit.
$ git commit -m "Add new file"
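The three steps can be spelled out with plumbing commands. This is a rough sketch of what a commit involves, not a claim about how git commit is implemented internally; the file names and identity values are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

# Step 1: staging stores the new blob; write-tree stores a tree for the index.
echo "second" > other.txt
git add other.txt
tree=$(git write-tree)

# Step 2: build a commit that points at the new tree and the current commit.
parent=$(git rev-parse HEAD)
commit=$(git commit-tree "$tree" -p "$parent" -m "Add new file")

# Step 3: move the checked out branch to the new commit.
git update-ref refs/heads/master "$commit"

git log --oneline
```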
A fast-forward merge is about the simplest operation we can perform. It creates no new objects, instead simply updating the current branch to reference a different commit.
$ git merge --ff-only feature
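A quick way to check the "no new objects" claim is to count the files under .git/objects before and after the merge. The sketch below does that in a scratch repository with a hypothetical feature branch; counting files this way assumes nothing has been packed, which holds for a repository this small.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Feature commit"
git checkout -q master

objects_before=$(find .git/objects -type f | wc -l)
git merge -q --ff-only feature
objects_after=$(find .git/objects -type f | wc -l)

echo "objects before: $objects_before, after: $objects_after"
echo "master:  $(git rev-parse master)"
echo "feature: $(git rev-parse feature)"
```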
A traditional merge is much more interesting. We start with two diverging histories, and Git creates a new tree for us from the two existing trees.
Once it has the new tree, Git will create a new commit that points at this tree. Lastly, the branch ref will be updated to point at this new commit.
Comparing these two merge strategies, it becomes clear why we prefer the fast-forward only merges. In a fast-forward merge we are just updating a pointer, but the code is not changed. In a traditional merge, Git does its best to bring together two different versions of the code, creating a new commit and tree that we have not interacted with.
$ git merge feature
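For contrast, this sketch sets up two diverging branches in a scratch repository and performs a traditional merge. The resulting commit should have two parents, the previous tip of master and the tip of feature. Branch, file, and message names are all placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

# Diverge: one commit on feature, a different one on master.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature commit"
git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Master commit"

master_before=$(git rev-parse master)
feature_tip=$(git rev-parse feature)

git merge -q -m "Merge feature" feature
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge commit has $parent_count parents"
```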
So with this comparison of traditional and fast-forward merges in mind, we can talk about our good friend rebase. Rebase can be performed when we have new commits on both our feature branch and our "upstream" branch (typically master). We want to update the commits on our branch so they include the latest upstream changes.
When we rebase, we essentially replay our work on the current version of the upstream branch. Git does this by calculating each of the diffs for the commits unique to our branch, then applying them onto the upstream branch one by one. Each application of a diff creates a new commit, reusing the associated commit message and author details.
Note that the old commits still exist, but they are now orphaned. No refs point to them any longer and so they are essentially unreachable, although we know from the discussion of the reflog in the first video that we could easily restore them by checking the reflog.
Once all the new commits have been created, our branch is updated to point at the tip commit of our rebased group.
From here, we could now fast-forward merge our branch into master, as we are now in line with its history. The key difference between this and a traditional merge is that all of the commits here were created by us, and we get to interact with them and test them as needed before merging them into master.
$ git rebase master
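The effects described above can be observed in a scratch repository: the rebased branch gets a new tip commit whose parent is master, the commit message is reused, and the old commit object is still present. Branch, file, and message names here are invented for the sketch.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature commit"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Master commit"

git checkout -q feature
old_tip=$(git rev-parse feature)
git rebase -q master
new_tip=$(git rev-parse feature)

new_parent=$(git rev-parse 'feature^')
old_type=$(git cat-file -t "$old_tip")       # the old commit still exists
new_subject=$(git log -1 --format=%s feature)
echo "old tip: $old_tip ($old_type)"
echo "new tip: $new_tip ($new_subject)"
```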
Interactive rebase is very similar. We begin with a set of commits, typically on a feature branch and ahead of master, and we perform our interactive rebase. When we squash them down, we create a new commit using the tree of our former tip commit, and compose a new commit message.
Once again, we can see that the old commits live on despite being orphaned, and we can therefore get back to them as needed.
$ git rebase --interactive master
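Because an interactive rebase opens an editor, it is awkward to script. The sketch below instead mimics the end result of squashing with plumbing commands: it builds one commit that reuses the tree of the former tip and has master as its parent. This is an approximation for illustration, not what git rebase runs internally, and all names are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"

# Three small commits on a feature branch that is ahead of master.
git checkout -q -b feature
for n in 1 2 3; do
  echo "step $n" >> work.txt
  git add work.txt
  git commit -q -m "WIP $n"
done

old_tip=$(git rev-parse feature)
tip_tree=$(git rev-parse 'feature^{tree}')

# One new commit: same tree as the old tip, new message, master as parent.
squashed=$(git commit-tree "$tip_tree" -p "$(git rev-parse master)" -m "Add work in one commit")
git update-ref refs/heads/feature "$squashed"

git log --oneline master..feature
echo "old tip still exists as a $(git cat-file -t "$old_tip")"
```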