Inside Git: How It Works and the Role of the .git Folder

Git is not magic: unfolding the mystery inside the .git folder


I am a problem solver and a full-stack developer. I love to write and share my knowledge. My wish is to learn and help others learn at the same time. Hoping to make a difference in the tech world.

How Git Works Internally

Have you ever wondered what actually happens when you run git init for the first time in your working directory, and how Git is then able to track your files? Let’s break it down:

  • When you run git init, Git creates a new folder called .git inside your working directory. This hidden directory is the repository itself: it holds everything Git needs to track changes. Without it, Git cannot do anything.

  • To check that this directory really exists, run ls -a, which lists all files including hidden ones.

Never modify or delete the .git directory by hand unless you know exactly what you are doing. Deleting it erases the repository’s entire history, and careless edits can corrupt it.

Understanding the .git Repository

If we open the .git directory and take a look, this is what it usually looks like:

.git/
├── HEAD
├── config
├── index
├── objects/
├── refs/
│   ├── heads/
│   └── tags/
└── logs/

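Six top-level entries are shown here, and each has its own role in making Git work. A real .git directory usually holds a few more, such as hooks/ and info/, while index and logs/ are created once you start staging and committing.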

In Git, your source code lives in two places:

  1. Working Directory – files you edit

  2. Repository (.git) – Git’s memory and logic

Everything Git knows — history, branches, commits, tags — lives only inside .git.

Git Internals

1. HEAD — Where You Are Right Now

HEAD answers one question:

What is the current point in history?

Usually it contains:

ref: refs/heads/main

Meaning:

  • You are on branch main

  • HEAD points to a branch

  • The branch points to a commit
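
The HEAD → branch → commit lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not how Git itself is implemented: the function name resolve_head, the throwaway directory, and the made-up commit ID are assumptions for the demo, and the sketch only handles loose ref files (refs that Git has moved into packed-refs are not looked up).

```python
import os
import tempfile


def resolve_head(git_dir):
    """Return (branch_ref_or_None, commit_id_or_None) for a .git directory.

    Only loose ref files are handled; packed-refs is ignored in this sketch.
    """
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        # Detached HEAD: the file holds a commit ID directly.
        return None, head
    ref = head[len("ref: "):]
    ref_path = os.path.join(git_dir, *ref.split("/"))
    if not os.path.exists(ref_path):
        # Fresh repository with no commits yet, or a packed ref.
        return ref, None
    with open(ref_path) as f:
        return ref, f.read().strip()


# Demo on a hand-built, fake .git layout (the commit ID is hypothetical).
demo_git_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_git_dir, "refs", "heads"))
with open(os.path.join(demo_git_dir, "HEAD"), "w") as f:
    f.write("ref: refs/heads/main\n")
with open(os.path.join(demo_git_dir, "refs", "heads", "main"), "w") as f:
    f.write("a" * 40 + "\n")

print(resolve_head(demo_git_dir))
```

Pointing the function at a real repository’s .git directory should print the current branch ref and the commit it points to, as long as that ref is stored as a loose file.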


2. refs/ — Names for Commits

The refs directory stores human-friendly names for commit hashes.

refs/heads/main
refs/heads/feature-x
refs/tags/v1.0

Each ref file contains exactly one object ID: a commit hash for a branch, and either a commit or an annotated tag object for a tag.

Branches are not folders of commits — they are movable labels.


3. objects/ — Git’s Database

This is where Git stores everything permanently:

  • commits

  • trees (directories)

  • blobs (file contents)

  • annotated tags

Objects are:

  • immutable

  • content-addressed

  • identified by a hash

Git doesn’t track files.
It tracks snapshots.
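
To make “content-addressed” concrete, here is a small Python sketch of how a loose object is laid out on disk and how its bytes can be decoded. The helper names are assumptions made for this example, and the sample bytes are built in memory rather than read from a real repository. Objects that Git has packed into packfiles use a different format and are not covered.

```python
import os
import zlib


def loose_object_path(git_dir, object_id):
    """Loose objects live at objects/<first 2 hex chars>/<remaining chars>."""
    return os.path.join(git_dir, "objects", object_id[:2], object_id[2:])


def parse_loose_object(compressed):
    """Decompress a loose object and split it into (type, size, content)."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode(), int(size), content


# Simulate the bytes of a loose blob file instead of reading a real repository.
sample = zlib.compress(b"blob 6\x00hello\n")
print(parse_loose_object(sample))  # ('blob', 6, b'hello\n')
print(loose_object_path(".git", "3b18e512dba79e4c8300dd08aeb37f8e728b8dad"))
```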


4. index — The Staging Area

The index (staging area) is where Git prepares the next commit.

Working Directory → Index → Commit

This is why git add exists — it’s an explicit selection step.
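
Unlike HEAD or the ref files, the index is a binary file. As a rough illustration, the sketch below decodes only its 12-byte header (a DIRC signature, a version number, and the number of staged entries, all big-endian). The function name is an assumption, and the sample bytes are synthetic rather than taken from a real .git/index.

```python
import struct


def read_index_header(data):
    """Decode the fixed 12-byte header at the start of a Git index file."""
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    return version, entry_count


# Synthetic header: version 2 with 3 staged entries.
fake_index = struct.pack(">4sII", b"DIRC", 2, 3)
print(read_index_header(fake_index))  # (2, 3)
```

On a real repository, passing the bytes read from .git/index (opened in binary mode) should report how many paths are currently staged.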

5. config — Repository Settings

Contains repository-specific configuration:

  • remotes

  • branch behavior

  • core settings (for example core.hooksPath, which tells Git where to find hook scripts)

Values set here override the global Git config for this repository.


6. logs/ — The Safety Net

Tracks how refs move over time.

Used by:

git reflog

This is why Git can recover from:

  • hard resets

  • rebases (which replay commits from one branch onto another)

  • detached HEAD mistakes

Git remembers even the things you wish you hadn’t done.
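
Each file under logs/ is plain text with one line per ref movement: the old object ID, the new object ID, who made the change and when, then a tab and a short message. A minimal parser, assuming that layout, might look like this. The function name and the sample line (identity, timestamp, and IDs) are made up for illustration.

```python
def parse_reflog_line(line):
    """Split one reflog line into its old ID, new ID, identity, and message."""
    meta, _, message = line.rstrip("\n").partition("\t")
    old_id, new_id, identity = meta.split(" ", 2)
    return {"old": old_id, "new": new_id, "who": identity, "message": message}


# Hypothetical entry for the very first commit on a branch (old ID is all zeros).
sample_line = (
    "0" * 40 + " " + "a" * 40
    + " Jane Doe <jane@example.com> 1700000000 +0000"
    + "\tcommit (initial): first commit\n"
)
entry = parse_reflog_line(sample_line)
print(entry["old"][:7], "->", entry["new"][:7], "|", entry["message"])
```

Because the old ID of every movement is recorded, a commit that no branch points to any more can still be found by its hash and checked out again.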

Git Objects: Blob, Tree, Commit

In Git, everything stored permanently is an object, identified by a hash.

  • Blob – stores file contents only (no filename, no permissions)

  • Tree – stores directory structure (names → blobs/trees)

  • Commit – stores a snapshot reference (points to a tree + parent commit)

Blob = data, Tree = structure, Commit = history.

Git Command Internals

Let’s look at the internal workings of two important Git commands:

  • git add

  • git commit

git add

In Git, git add does not create a commit.
It prepares data for the next commit by updating the index (staging area).

What actually happens:

  1. Reads the file from the working directory
    Git takes the current file contents exactly as they are.

  2. Creates a Blob object (if needed)

    • The file contents are hashed

    • Stored in .git/objects/ as a blob

    • If an identical blob already exists, Git reuses it

  3. Updates the Index (staging area)

    • Records:

      • blob hash

      • file path

      • file mode (permissions)

    • This snapshot represents what the next commit will look like

  4. Nothing else moves

    • No commit is created

    • No branch is updated

    • HEAD does not change

git add writes content to the index by hashing files into blobs and recording them as the next snapshot.
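
The steps above can be approximated in Python. This is a sketch under simplifying assumptions, not Git’s actual code: the index is modelled as a plain in-memory dict (the real one is a binary file), SHA-1 is assumed as the hash, and the names write_blob and stage are invented for the example.

```python
import hashlib
import os
import tempfile
import zlib


def write_blob(git_dir, content):
    """Hash file contents as a blob and store them as a loose object."""
    store = b"blob " + str(len(content)).encode() + b"\x00" + content
    object_id = hashlib.sha1(store).hexdigest()
    path = os.path.join(git_dir, "objects", object_id[:2], object_id[2:])
    if not os.path.exists(path):  # identical content is stored only once
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(zlib.compress(store))
    return object_id


def stage(index, git_dir, path, content, mode=0o100644):
    """Record path -> (mode, blob hash) in a dict standing in for the index."""
    index[path] = (mode, write_blob(git_dir, content))
    return index


git_dir = tempfile.mkdtemp()  # throwaway stand-in for a .git directory
index = {}
stage(index, git_dir, "a.txt", b"hello world\n")
stage(index, git_dir, "copy.txt", b"hello world\n")
for path, (mode, blob_id) in index.items():
    print(oct(mode), blob_id, path)
```

Both paths end up pointing at the same blob, and only one object file is written, which is the reuse described in step 2. Nothing resembling a commit or a branch is touched.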


The key data flow

Working Directory
      │
      │  git add
      ▼
Index (staging area)
      │
      │  git commit
      ▼
Commit (objects database)

[Figure: internals of git add (AI-generated illustration)]

git commit

In Git, git commit turns the staged snapshot into permanent history. It doesn’t read your working files directly—it uses the index.

What actually happens:

  1. Reads the Index (staging area)
    Git takes the exact snapshot you staged with git add.

  2. Builds Tree object(s)

    • Creates a root tree representing directories

    • Trees reference blobs (files) and subtrees (folders)

  3. Creates a Commit object
    The commit stores:

    • pointer to the root tree

    • parent commit hash(es)

    • author & committer

    • timestamp

    • commit message

  4. Writes objects to .git/objects/
    Trees and the commit are stored as immutable, content-addressed objects.

  5. Moves the branch ref forward

    • The current branch (that HEAD points to) now points to the new commit

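    • HEAD itself doesn’t change, because it still just names the branch (in a detached HEAD state there is no branch to move, so HEAD is updated directly)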

  6. Updates reflogs
    Enables recovery via git reflog.


The data flow

Index (staged snapshot)
        │
        ▼
     Tree(s)
        │
        ▼
     Commit
        │
        ▼
refs/heads/<branch>

git commit converts the staged snapshot into a tree, wraps it in a commit, and moves the current branch to point at it.
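
As a rough sketch of this flow, the Python below builds a tree from a one-file staged snapshot, wraps it in a commit, and “moves” a branch. It makes several assumptions: SHA-1 as the hash, a flat snapshot with no subdirectories, a dict standing in for both the index and the refs, and an invented author and timestamp. The function names are illustrative only.

```python
import hashlib


def object_id(obj_type, body):
    """Hash '<type> <size>\\0<body>' the way Git names its objects."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex ID) entries: '<mode> <name>\\0<20 raw bytes>'.

    Sorting by plain name is a simplification that is fine for flat trees.
    """
    body = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return body


def commit_body(tree_id, parent_ids, author, timestamp, message):
    """Serialize a commit: tree, parents, author, committer, blank line, message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines.append(f"author {author} {timestamp} +0000")
    lines.append(f"committer {author} {timestamp} +0000")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


# Staged snapshot: one file whose blob hash is already known.
index = {"hello.txt": ("100644", object_id("blob", b"hello world\n"))}
tree = tree_body([(mode, name, oid) for name, (mode, oid) in index.items()])
tree_id = object_id("tree", tree)
commit = commit_body(tree_id, [], "Jane Doe <jane@example.com>", 1700000000, "Initial commit")
commit_id = object_id("commit", commit)
refs = {"refs/heads/main": commit_id}  # step 5: the branch now names the new commit
print("tree  ", tree_id)
print("commit", commit_id)
```

Note that the working directory never appears in this sketch: everything the commit contains comes from the staged snapshot.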

Git Integrity

In Git, hashes are the foundation of trust. Git doesn’t assume data is correct—it proves it.

1. Content-addressed storage (the core idea)

Every object in Git (blob, tree, commit, annotated tag) is stored under the hash of its contents.

  • Change the content → hash changes

  • Same content → same hash (deduplication)

This means:

The name of the data is derived from the data itself.

2. What exactly is hashed?

Git computes a hash (SHA-1 by default) over:

<object-type> <size>\0<content>

So, the hash uniquely represents:

  • the type (blob/tree/commit)

  • the exact bytes

  • the exact size

Any single-bit change produces a completely different hash.
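
This formula is easy to reproduce outside Git. The sketch below assumes SHA-1 (repositories created in SHA-256 mode would use a different hash function), and the function name is made up for the example. The second input differs from the first by a single bit, since “d” and “e” are adjacent in ASCII.

```python
import hashlib


def git_object_hash(obj_type, content):
    """Hash '<object-type> <size>\\0<content>' with SHA-1."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


print(git_object_hash("blob", b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(git_object_hash("blob", b"hello worle\n"))  # one bit flipped, unrelated hash
print(git_object_hash("blob", b""))               # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_object_hash("tree", b""))               # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

The last two lines hash the same empty content but give different results, because the object type is part of what gets hashed.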


3. Integrity through chaining (why history is tamper-evident)

  • Blobs store file contents

  • Trees store names → blob/tree hashes

  • Commits store:

    • tree hash

    • parent commit hash(es)

    • metadata

This creates a hash chain:

Commit
  ↓ (tree hash)
Tree
  ↓ (blob hashes)
Blob

And between commits:

Commit C3 → parent C2 → parent C1

If you alter anything in the past:

  • that object’s hash changes

  • every dependent hash breaks

  • Git detects the inconsistency
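
The ripple effect can be demonstrated with a toy model of the commit chain. This sketch is deliberately simplified: the commit body here holds only a tree line, an optional parent line, and a message (real commits also carry author and committer lines), every commit reuses the well-known empty-tree ID, and SHA-1 is assumed. The function names are illustrative.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree_id, parent_id, message):
    """Hash a simplified commit body that embeds its parent's ID."""
    body = f"tree {tree_id}\n"
    if parent_id:
        body += f"parent {parent_id}\n"
    body += f"\n{message}\n"
    raw = body.encode()
    return hashlib.sha1(b"commit " + str(len(raw)).encode() + b"\x00" + raw).hexdigest()


def build_chain(messages, tree_id):
    """Create commits in order, each one pointing at the previous one."""
    ids, parent = [], None
    for message in messages:
        parent = commit_id(tree_id, parent, message)
        ids.append(parent)
    return ids


original = build_chain(["C1", "C2", "C3"], EMPTY_TREE)
tampered = build_chain(["C1 (edited)", "C2", "C3"], EMPTY_TREE)
for before, after in zip(original, tampered):
    print(before[:10], "->", after[:10], "changed" if before != after else "same")
```

Editing only the oldest commit changes all three IDs, because each later commit embeds the ID of its parent. Any ref or clone that still holds the original IDs no longer matches the rewritten history.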


4. Corruption detection (verification on demand)

Git can verify integrity using:

git fsck

It:

  • recomputes hashes

  • checks object references

  • reports any corruption it finds

This works even after:

  • disk errors

  • interrupted writes

  • manual tampering
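
The hash-recomputation part of this check is simple enough to sketch. The Python below is not a substitute for git fsck, which also validates references between objects and reads packfiles. It only walks loose objects in a throwaway directory and flags any whose content no longer hashes to its own name. SHA-1 and the helper names are assumptions.

```python
import hashlib
import os
import tempfile
import zlib


def verify_loose_objects(git_dir):
    """Return IDs of loose objects whose content no longer matches their name."""
    corrupt = []
    objects_dir = os.path.join(git_dir, "objects")
    for fanout in sorted(os.listdir(objects_dir)):
        if len(fanout) != 2:  # skip non-fanout directories such as pack/ and info/
            continue
        for rest in sorted(os.listdir(os.path.join(objects_dir, fanout))):
            expected = fanout + rest
            with open(os.path.join(objects_dir, fanout, rest), "rb") as f:
                data = f.read()
            try:
                actual = hashlib.sha1(zlib.decompress(data)).hexdigest()
            except zlib.error:
                actual = None  # not even valid compressed data
            if actual != expected:
                corrupt.append(expected)
    return corrupt


def write_raw(git_dir, object_id, raw):
    """Demo helper: store raw object bytes under a chosen ID."""
    path = os.path.join(git_dir, "objects", object_id[:2], object_id[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(raw))


demo = tempfile.mkdtemp()
good = b"blob 12\x00hello world\n"
good_id = hashlib.sha1(good).hexdigest()
write_raw(demo, good_id, good)
print(verify_loose_objects(demo))  # []
write_raw(demo, good_id, b"blob 12\x00hello w0rld\n")  # simulate manual tampering
print(verify_loose_objects(demo))  # ['3b18e512dba79e4c8300dd08aeb37f8e728b8dad']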


5. Why this design is powerful

  • Immutability: objects never change once written

  • Trust: history cannot be silently altered

  • Efficiency: identical content stored once

  • Security: corruption is detectable, not hidden

Git doesn’t track files — it tracks proof of content.

This is how Git operates internally, and this design is a large part of why it has been so widely adopted across the industry.