Inside Git: How It Works and the Role of the .git Folder

How Git Works Internally

Have you ever wondered what actually happens when you run git init for the first time in your working directory, and how Git becomes able to track your files? Let’s break it down:

When you run git init, the first thing Git does is create a new folder called .git inside your working directory. This hidden directory is the repository itself: it holds everything Git needs to track changes, and without it Git has nothing to work with.
To check that the folder really exists, run ls -a, which lists all files, including hidden ones.

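Avoid editing or deleting anything inside the .git folder by hand unless you know exactly what you are doing. Deleting it erases the repository’s entire local history, and a careless edit can leave Git unable to track changes properly.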

Understanding the `.git` Repository

If we open the .git folder and take a look, this is what it usually looks like:

.git/
├── HEAD
├── config
├── index
├── objects/
├── refs/
│   ├── heads/
│   └── tags/
└── logs/

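The listing shows six top-level entries, and each has its own role in making Git work. A freshly initialized repository also contains a few others, such as hooks/ and info/, while index and logs/ appear only once you start staging and committing.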

In Git, your source code lives in two places:

Working Directory – files you edit
Repository (.git) – Git’s memory and logic

Everything Git knows — history, branches, commits, tags — lives only inside .git.

Git Internals

1. `HEAD` — Where You Are Right Now

HEAD answers one question:

What is the current point in history?

Usually it contains:

ref: refs/heads/main

Meaning:

You are on branch main
HEAD points to a branch
The branch points to a commit
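To make that chain of pointers concrete, here is a minimal Python sketch of resolving HEAD by hand. The function name resolve_head is hypothetical, and the sketch assumes refs are stored as loose files (it ignores .git/packed-refs), so treat it as an illustration rather than how Git itself does it.

```python
from pathlib import Path


def resolve_head(git_dir):
    """Return (branch_ref_or_None, commit_id_or_None) for a .git directory.

    Hypothetical helper for illustration: it only understands loose ref
    files and ignores .git/packed-refs.
    """
    head = (Path(git_dir) / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref = head[len("ref: "):]                # e.g. "refs/heads/main"
        ref_file = Path(git_dir) / ref
        # On a brand-new repository the branch has no commits yet,
        # so the ref file does not exist.
        commit = ref_file.read_text().strip() if ref_file.exists() else None
        return ref, commit
    # Otherwise HEAD holds a raw commit ID (a "detached HEAD").
    return None, head
```

Calling resolve_head(".git") inside a repository should return the current branch ref together with the commit ID it points to.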

2. `refs/` — Names for Commits

The refs directory stores human-friendly names for commit hashes.

refs/heads/main
refs/heads/feature-x
refs/tags/v1.0

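Each ref file contains exactly one object ID, usually that of a commit (for an annotated tag it is the ID of the tag object, which in turn points to a commit).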

Branches are not folders of commits — they are movable labels.

3. `objects/` — Git’s Database

This is where Git stores everything permanently:

commits
trees (directories)
blobs (file contents)
annotated tags

Objects are:

immutable
content-addressed
identified by a hash

Git doesn’t track files.
It tracks snapshots.
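As a rough sketch of what "stored permanently" means on disk, the Python below writes and reads a loose object. It assumes the default SHA-1 object format and the loose-object layout (zlib-compressed files under objects/, with the first two hex digits of the ID as the directory name). The function names are made up for this example, and packed objects are not handled.

```python
import hashlib
import zlib
from pathlib import Path


def write_loose_object(git_dir, obj_type, content):
    """Store content as a loose object and return its hex ID (SHA-1 assumed)."""
    store = b"%s %d\x00" % (obj_type.encode(), len(content)) + content
    oid = hashlib.sha1(store).hexdigest()
    # First two hex digits name the directory, the remaining 38 name the file.
    path = Path(git_dir) / "objects" / oid[:2] / oid[2:]
    if not path.exists():                        # identical content is stored once
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(store))
    return oid


def read_loose_object(git_dir, oid):
    """Return (type, content) for a loose object."""
    path = Path(git_dir) / "objects" / oid[:2] / oid[2:]
    raw = zlib.decompress(path.read_bytes())
    header, _, content = raw.partition(b"\x00")
    obj_type, _size = header.decode().split(" ")
    return obj_type, content
```

Because the file name is derived from the content, writing the same bytes twice lands on the same path, which is where the deduplication comes from.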

4. `index` — The Staging Area

The index (staging area) is where Git prepares the next commit.

Working Directory → Index → Commit

This is why git add exists — it’s an explicit selection step.
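The index is a binary file, so you cannot simply open it in a text editor. As a small illustration, the sketch below reads only its 12-byte header (a "DIRC" signature, a version number, and the number of staged entries). The function name parse_index_header is hypothetical, and the individual entries are deliberately left unparsed.

```python
import struct


def parse_index_header(data):
    """Return (version, entry_count) from the first 12 bytes of .git/index."""
    # Big-endian: 4-byte signature, 32-bit version, 32-bit entry count.
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":                     # "DIRC" stands for "dircache"
        raise ValueError("not a Git index file")
    return version, count
```

In a repository with staged files, you could try it with parse_index_header(open(".git/index", "rb").read()).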

5. `config` — Repository Settings

Contains repository-specific configuration:

remotes
branch behavior
hooks configuration

Settings here override your global Git config, but only for this repository.

6. `logs/` — The Safety Net

Tracks how refs move over time.

Used by:

git reflog

This is why Git can recover from:

hard resets
rebases (which replay commits from one branch onto another)
commits made on a detached HEAD

Git remembers even the things you wish you hadn’t done.
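Each file under logs/ is plain text with one line per ref movement: the old ID, the new ID, who made the change, when, and a short message after a tab. Here is a minimal sketch of splitting such a line into fields. The function name and the dictionary keys are choices made for this example only.

```python
def parse_reflog_line(line):
    """Split one line of a file under .git/logs/ into its fields."""
    # Everything before the tab is metadata; everything after is the message.
    meta, _, message = line.rstrip("\n").partition("\t")
    old, new, rest = meta.split(" ", 2)
    # The identity may contain spaces, so peel timestamp and zone off the end.
    identity, timestamp, tz = rest.rsplit(" ", 2)
    return {"old": old, "new": new, "who": identity,
            "time": int(timestamp), "tz": tz, "message": message}
```

The "old" and "new" fields are what make recovery possible: even after a hard reset, the previous commit ID is still written down here until the reflog entry expires.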

Git Objects: Blob, Tree, Commit

In Git, everything stored permanently is an object, identified by a hash. The three you meet most often are:

Blob – stores file contents only (no filename, no permissions)
Tree – stores directory structure (names → blobs/trees)
Commit – records a snapshot plus metadata (points to one root tree and to zero or more parent commits)

Blob = data, Tree = structure, Commit = history.

Git Command Internals

Let’s look at how two important Git commands work internally:

git add
git commit

`git add`

In Git, git add does not create a commit.
It prepares data for the next commit by updating the index (staging area).

What actually happens:

Reads the file from the working directory
- Git takes the current file contents exactly as they are
Creates a Blob object (if needed)
- The file contents are hashed
- Stored in .git/objects/ as a blob
- If an identical blob already exists, Git reuses it
Updates the Index (staging area)
- Records:
  - blob hash
  - file path
  - file mode (permissions)
- This snapshot represents what the next commit will look like
Nothing else moves
- No commit is created
- No branch is updated
- HEAD does not change

In short, git add hashes file contents into blobs in the object database and records them in the index as the next snapshot.
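The steps above can be modelled in a few lines of Python. This is only a toy: the dictionaries objects and index stand in for .git/objects/ and .git/index, the function name stage is made up, and SHA-1 is assumed as the hash.

```python
import hashlib


def stage(objects, index, path, content, mode=0o100644):
    """Toy model of `git add`: hash content into a blob, then record it.

    `objects` (dict: id -> bytes) and `index` (dict: path -> (mode, id))
    are in-memory stand-ins for .git/objects/ and .git/index.
    """
    store = b"blob %d\x00" % len(content) + content
    blob_id = hashlib.sha1(store).hexdigest()
    objects.setdefault(blob_id, store)           # reuse the blob if it already exists
    index[path] = (mode, blob_id)                # path -> mode + blob hash
    return blob_id
```

Staging two files with identical contents leaves one blob in objects and two entries in index, and nothing resembling a commit or a branch is touched.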

The key data flow

Working Directory
      │
      │  git add
      ▼
Index (staging area)
      │
      │  git commit
      ▼
Commit (objects database)

Figure: Internals of git add (AI-generated illustration)

`git commit`

In Git, git commit turns the staged snapshot into permanent history. It doesn’t read your working files directly—it uses the index.

What actually happens:

Reads the Index (staging area)
- Git takes the exact snapshot you staged with git add
Builds Tree object(s)
- Creates a root tree representing directories
- Trees reference blobs (files) and subtrees (folders)
Creates a Commit object
- The commit stores:
  - pointer to the root tree
  - parent commit hash(es)
  - author & committer
  - timestamp
  - commit message
Writes objects to .git/objects/
- Trees and the commit are stored as immutable, content-addressed objects
Moves the branch ref forward
- The current branch (the one HEAD points to) now points to the new commit
- The HEAD file itself doesn’t change: it still names the same branch (on a detached HEAD, Git updates HEAD directly instead)
Updates reflogs
- Records the move so that it can be recovered later via git reflog

The data flow

Index (staged snapshot)
        │
        ▼
     Tree(s)
        │
        ▼
     Commit
        │
        ▼
refs/heads/<branch>

git commit converts the staged snapshot into a tree, wraps it in a commit, and moves the current branch to point at it.
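Here is a toy Python model of that flow, under some loud assumptions: the index is flat (no subdirectories, ASCII file names), the time zone is fixed at +0000, SHA-1 is the hash, and index and refs are plain dictionaries. The function names are hypothetical.

```python
import hashlib


def hash_object(obj_type, body):
    """Hash "<type> <size>\\0<body>" with SHA-1 and return the hex ID."""
    return hashlib.sha1(b"%s %d\x00" % (obj_type, len(body)) + body).hexdigest()


def commit(index, refs, branch, message, who, when):
    """Toy model of `git commit` for a flat index (no subdirectories).

    index: path -> (mode, blob_id); refs: ref name -> commit id.
    """
    # 1. Build the root tree: "<mode> <name>\0<20 raw hash bytes>", sorted by name.
    tree_body = b"".join(
        b"%o %s\x00" % (mode, path.encode()) + bytes.fromhex(blob_id)
        for path, (mode, blob_id) in sorted(index.items())
    )
    tree_id = hash_object(b"tree", tree_body)
    # 2. Build the commit: tree, optional parent, identities, blank line, message.
    lines = ["tree " + tree_id]
    if branch in refs:
        lines.append("parent " + refs[branch])
    lines += ["author %s %d +0000" % (who, when),
              "committer %s %d +0000" % (who, when), "", message, ""]
    commit_id = hash_object(b"commit", "\n".join(lines).encode())
    # 3. Move the branch ref; HEAD would still say "ref: <branch>".
    refs[branch] = commit_id
    return tree_id, commit_id
```

Notice that the working directory never appears in the function: everything is derived from the index and the current branch ref.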

Git Integrity

In Git, hashes are the foundation of trust. Git doesn’t assume data is correct—it proves it.

1. Content-addressed storage (the core idea)

Every object in Git (blob, tree, commit, or annotated tag) is stored under the hash of its contents (SHA-1 by default).

Change the content → hash changes
Same content → same hash (deduplication)

This means:

The name of the data is derived from the data itself.

2. What exactly is hashed?

Git computes a hash over:

<object-type> <size>\0<content>

So, the hash uniquely represents:

the type (blob/tree/commit)
the exact bytes
the exact size

Any single-bit change produces a completely different hash.
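You can reproduce this formula outside Git. The sketch below assumes the default SHA-1 object format, and the helper name git_object_id is made up for illustration.

```python
import hashlib


def git_object_id(obj_type, content):
    """SHA-1 over "<object-type> <size>\\0<content>"."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


print(git_object_id("blob", b""))                # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_object_id("blob", b"hello world\n"))   # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(git_object_id("tree", b""))                # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

The first and last lines hash the same (empty) bytes but give different IDs, because the object type is part of what gets hashed.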

3. Integrity through chaining (why history is tamper-evident)

Blobs store file contents
Trees store names → blob/tree hashes
Commits store:
- tree hash
- parent commit hash(es)
- metadata

This creates a hash chain:

Commit
  ↓ (tree hash)
Tree
  ↓ (blob hashes)
Blob

And between commits:

Commit C3 → parent C2 → parent C1

If you alter anything in the past:

that object’s hash changes
every tree and commit that depends on it would need a new hash too
the change shows up as different commit IDs, or as a mismatch that git fsck reports
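To see the ripple effect, here is a deliberately simplified Python sketch of a one-file snapshot. It assumes SHA-1, a hypothetical file named notes.txt, and a stripped-down commit body (the author and committer lines are left out for brevity), so the IDs will not match what real Git would produce for a real commit.

```python
import hashlib


def oid(obj_type, body):
    """Hash "<type> <size>\\0<body>" with SHA-1."""
    return hashlib.sha1(f"{obj_type} {len(body)}".encode() + b"\x00" + body).hexdigest()


def snapshot(file_bytes, parent=None):
    """Return (blob, tree, commit) IDs for a one-file snapshot (simplified)."""
    blob = oid("blob", file_bytes)
    # The tree entry embeds the blob's hash, so the tree depends on the blob.
    tree = oid("tree", b"100644 notes.txt\x00" + bytes.fromhex(blob))
    # The commit embeds the tree's hash and, if present, the parent's hash.
    text = f"tree {tree}\n" + (f"parent {parent}\n" if parent else "") + "\nexample\n"
    return blob, tree, oid("commit", text.encode())


original = snapshot(b"v1\n")
tampered = snapshot(b"v1!\n")                    # one extra byte in the old file
child_a = snapshot(b"v2\n", parent=original[2])
child_b = snapshot(b"v2\n", parent=tampered[2])
```

Here original and tampered differ in all three IDs, while child_a and child_b share the same blob and tree yet still get different commit IDs, purely because their parents differ.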

4. Corruption detection (automatic verification)

Git can verify integrity using:

git fsck

It:

recomputes hashes
checks object references
reports any corrupt or missing objects it finds

This catches damage caused by:

disk errors
interrupted writes
manual tampering
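The hash-recomputation part of that check is easy to sketch. The Python below is not how git fsck is implemented (the real command also validates object formats and reachability). It assumes SHA-1 and uses a dictionary as a stand-in for the files under .git/objects/, and the function name find_corrupt is made up.

```python
import hashlib
import zlib


def find_corrupt(objects):
    """Return IDs whose stored bytes no longer hash to their own name.

    objects: dict mapping object ID -> zlib-compressed object bytes,
    a stand-in for the files under .git/objects/.
    """
    bad = []
    for oid, compressed in objects.items():
        try:
            data = zlib.decompress(compressed)
        except zlib.error:
            bad.append(oid)                      # unreadable, e.g. a truncated write
            continue
        if hashlib.sha1(data).hexdigest() != oid:
            bad.append(oid)                      # content changed after it was stored
    return bad
```

Since an object's name is its hash, no separate checksum file is needed: the store verifies itself.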

5. Why this design is powerful

Immutability: objects never change once written
Trust: history cannot be silently altered
Efficiency: identical content stored once
Security: corruption is detectable, not hidden

Git doesn’t track files — it tracks proof of content.

This is how Git operates internally, and this design is a large part of why it has been so widely adopted across the industry.
