What is Going On in the .git Directory?
Git is a free, open-source version control system (VCS) that the vast majority of web developers use to manage changes to source code, especially on projects with multiple programmers. Most developers know the basics: how to stage changes, commit them, push them to a remote repository, fix conflicts, and revert to previous versions. However, far fewer actually understand how Git works under the hood.
The complete inner workings of Git are too complex to cover in one blog post, but at the heart of Git lies the .git directory, which contains all the information necessary to manage your repository. Examining this directory will give us a solid, high-level overview of how Git works.
Creating a new Git repository
Let's start by creating a new directory and initializing a new Git repository to witness things from the very beginning.
$ mkdir -p ~/Code/gitdemo
$ cd ~/Code/gitdemo
$ git init
Initialized empty Git repository in /Users/danielsellergren/Code/gitdemo/.git/
Now let's see what we just created.
$ find .
That is a lot of stuff for an empty project! Let's go through the main files and directories one by one and see if we can get an understanding of what's going on.
What is in the .git directory?
The `.git/objects` directory is the main "object store" for the repository. Objects are the heart of a Git repository, containing all the original data, log messages, and directory structure. The four types of Git objects are blobs, trees, commits, and tags.
Blobs are "binary large objects", a term for any file regardless of its internal structure. A Git blob does not record the file's name or what kind of file it is; it simply stores the file's contents. Every source code file, image, or other file is a blob according to Git.
Trees store information about a single directory. This includes the blob identifiers (more on that later), the actual path names, and some basic metadata.
Commits store metadata about each change made to the repository, such as the author, committer, dates, and commit message. Each commit points directly to a tree, which captures the state of the project at the moment the commit was created.
Tags are human-readable names for specific objects, usually commits. They are most often used to identify specific releases or versions, but they have other uses as well.
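To make the idea of an object identifier concrete, here is a minimal Python sketch of how Git derives one in a default SHA-1 repository: it hashes a short header (the object type, the content length, and a NUL byte) followed by the content. The function names `git_object_id` and `blob_id` are made up for this illustration.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 ID Git would assign to an object with this type and body."""
    # Header format: "<type> <size in bytes>" followed by a single NUL byte.
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """ID of a blob. Only the contents are hashed, never the file name."""
    return git_object_id("blob", content)


if __name__ == "__main__":
    # Same value as running: echo 'test content' | git hash-object --stdin
    print(blob_id(b"test content\n"))  # d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

Because the file name never enters the hash, two files with identical contents share a single blob, no matter what they are called or where they live.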
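The `objects` directory also contains two subdirectories. The `pack` subdirectory contains packfiles, which store many objects together in compressed form to save disk space and speed up network transfer, and the `info` subdirectory holds some additional metadata about the object store.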
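Objects that have not yet been packed are stored "loose": each one is a zlib-compressed file under `objects/`, where the first two hex characters of the ID form a directory name and the remaining 38 form the file name. The sketch below imitates that layout in a scratch directory, assuming a default SHA-1 repository. The helpers `write_loose_object` and `read_loose_object` are illustrative names, not part of Git.

```python
import hashlib
import zlib
from pathlib import Path
from typing import Tuple


def write_loose_object(git_dir: Path, obj_type: str, body: bytes) -> str:
    """Store an object the way Git stores loose objects and return its ID."""
    store = f"{obj_type} {len(body)}".encode() + b"\0" + body
    object_id = hashlib.sha1(store).hexdigest()
    # objects/<first 2 hex chars>/<remaining 38 hex chars>
    path = git_dir / "objects" / object_id[:2] / object_id[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return object_id


def read_loose_object(git_dir: Path, object_id: str) -> Tuple[str, bytes]:
    """Read a loose object back and return (type, body)."""
    path = git_dir / "objects" / object_id[:2] / object_id[2:]
    store = zlib.decompress(path.read_bytes())
    header, _, body = store.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type.decode(), body
```

Point these helpers at a throwaway directory rather than a real repository. In a real repository an object may exist only inside a packfile, in which case no loose file will be found for it.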
The `.git/refs` directory contains references to the current state of your branches and tags. Each reference is stored as a file that contains the 40-character SHA-1 identifier of the object it points to, usually a commit. The `heads` subdirectory contains references to the tips of your branches, and the `tags` subdirectory contains references to your tags.
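Since each reference is just a small text file, listing them takes only a few lines. The following Python sketch walks the `refs` directory and maps every reference name to the ID stored in it. `read_refs` is a made-up helper name, and the sketch deliberately ignores references that Git has moved into the `packed-refs` file.

```python
from pathlib import Path
from typing import Dict


def read_refs(git_dir: Path) -> Dict[str, str]:
    """Map each loose reference (e.g. 'refs/heads/main') to the ID it contains."""
    refs = {}
    for path in sorted((git_dir / "refs").rglob("*")):
        if path.is_file():
            # Use the path relative to .git as the reference name.
            name = path.relative_to(git_dir).as_posix()
            refs[name] = path.read_text().strip()
    return refs
```

Calling `read_refs(Path(".git"))` from the root of a repository that has at least one commit should return entries such as `refs/heads/<branch>` alongside their 40-character IDs.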
The `.git/HEAD` file is a symbolic reference to the current branch. It points to the branch reference in the `refs/heads` directory, which in turn points to the commit that represents the tip of the branch.
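That two-step lookup can be sketched in Python as follows. When a branch is checked out, `HEAD` holds a line of the form `ref: refs/heads/<branch>`. When a specific commit is checked out instead (a "detached HEAD"), it holds the commit ID directly. `resolve_head` is a hypothetical helper name, and the sketch only looks at loose reference files.

```python
from pathlib import Path
from typing import Optional, Tuple


def resolve_head(git_dir: Path) -> Tuple[Optional[str], Optional[str]]:
    """Return (reference name, commit ID) for the current HEAD."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref_name = head[len("ref: "):]
        ref_path = git_dir / ref_name
        # A brand-new repository has no commits, so the branch file may not exist yet.
        commit = ref_path.read_text().strip() if ref_path.is_file() else None
        return ref_name, commit
    # Detached HEAD: the file contains a commit ID rather than a reference.
    return None, head
```

In a freshly initialized repository the first element names the default branch and the second is `None`, because nothing has been committed yet.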
The `.git/hooks` directory contains scripts that are executed at various points in the Git workflow. For example, you can use a hook to automatically run tests before pushing changes to the remote repository.
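A hook is simply an executable file in this directory named after the event it handles, such as `pre-push`. As a rough illustration, the Python sketch below writes such a file and marks it executable. The helper `install_hook` and the test command inside `PRE_PUSH_SCRIPT` are assumptions for this example, so swap in whatever command your project actually uses.

```python
import stat
from pathlib import Path

# Hypothetical hook body: the test command is an assumption, not a Git default.
PRE_PUSH_SCRIPT = """#!/bin/sh
# Run the test suite; a non-zero exit status makes Git abort the push.
exec python3 -m unittest discover
"""


def install_hook(git_dir: Path, name: str, script: str) -> Path:
    """Write a hook script into <git_dir>/hooks and make it executable."""
    hooks_dir = git_dir / "hooks"
    hooks_dir.mkdir(parents=True, exist_ok=True)
    hook_path = hooks_dir / name  # the file name must match the hook name exactly
    hook_path.write_text(script)
    # Add execute permission for user, group, and others.
    mode = hook_path.stat().st_mode
    hook_path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook_path
```

For example, `install_hook(Path(".git"), "pre-push", PRE_PUSH_SCRIPT)` would create `.git/hooks/pre-push`. Hooks live outside the tracked files, so each clone has to install its own copy.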
Other files and directories
- `.git/config` is a file containing repository-specific configuration options, for example remote URLs for places like GitHub and default branch information.
- `.git/description` is an optional file that provides a human-readable description of your repository.
- `.git/info` stores some additional repository information, such as the `exclude` file. This is a list of ignore patterns that applies to the whole repository but, unlike `.gitignore`, is not shared with other clones. It only affects high-level (AKA "Porcelain") commands such as `git status` or `git add`.
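The `config` file uses an INI-like syntax, so for simple cases Python's standard `configparser` can read it. The sketch below pulls the remote URLs out of a config file's text. It is not a full parser for Git's config format, `remote_urls` is a made-up helper name, and the URL in the sample is a placeholder.

```python
import configparser
from typing import Dict

# Sample text in the style of .git/config; the remote URL is a placeholder.
SAMPLE_CONFIG = """[core]
\trepositoryformatversion = 0
\tbare = false
[remote "origin"]
\turl = https://example.com/demo.git
\tfetch = +refs/heads/*:refs/remotes/origin/*
[branch "main"]
\tremote = origin
\tmerge = refs/heads/main
"""


def remote_urls(config_text: str) -> Dict[str, str]:
    """Return {remote name: url} for every [remote "<name>"] section."""
    parser = configparser.ConfigParser(
        interpolation=None,   # Git values may contain '%' characters
        strict=False,         # tolerate repeated keys and sections
        delimiters=("=",),    # Git separates keys and values with '=' only
        allow_no_value=True,  # Git allows bare boolean keys
    )
    parser.read_string(config_text)
    urls = {}
    for section in parser.sections():
        if section.startswith('remote "') and section.endswith('"'):
            name = section[len('remote "'):-1]
            if parser.has_option(section, "url"):
                urls[name] = parser.get(section, "url")
    return urls
```

With the sample above, `remote_urls(SAMPLE_CONFIG)` returns a dictionary with a single entry for `origin`. For anything beyond quick inspection, `git config --get remote.origin.url` is the more reliable route.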
The .git directory is the backbone of Git and is essential to the functioning of your repository. Understanding its structure is key to using Git effectively: the better you understand how this directory works, the better you'll understand what each Git command is actually doing.