Commit Size Distribution

Measure change granularity and commit patterns

Commit size matters. I used to believe that developers should always aim for small commits, but in practice the reality is more nuanced. Some large commits are unavoidable, such as major refactorings and formatting-only changes.

In those cases, I usually encourage developers to keep the large refactoring in a single pull request and submit follow-up feature work separately. Mixing refactoring with new behavior tends to obscure intent and complicate reviews.

That said, consistently large commits are often a signal of problems in the development process. Smaller commits typically reflect work that has been broken down into logical units and subtasks, which usually correlates with stronger design thinking and better planning.

At the same time, commits that are too small can lack context and make the overall change harder to understand. Striking the right balance isn't easy.

Why it matters

Small commits are easier to review, test, and understand. Large commits are riskier: they tend to touch many files and mix unrelated changes, which makes it easier for bugs to slip through unnoticed.

When you see a commit with 2,000 lines changed, there is a higher chance that the review was rushed or skipped entirely. By tracking commit sizes, you can spot teams that need help with their workflow, find massive changes that need extra testing, and encourage better development habits.
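
One way to track commit sizes is to parse the per-file line counts that git already reports. The sketch below is a minimal illustration under stated assumptions, not a definitive implementation: it expects text produced by git log --numstat --format="commit %H", and the bucket boundaries and the names BUCKETS, parse_numstat, bucket, and distribution are hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical size buckets as (upper bound in changed lines, label).
# These boundaries are assumptions; tune them for your own team.
BUCKETS = [(50, "small"), (400, "medium"), (2000, "large")]


def parse_numstat(log_text):
    """Map commit hash -> lines changed (added + deleted).

    Expects the output of: git log --numstat --format="commit %H"
    """
    sizes = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith("commit "):
            current = line.split()[1]
            sizes[current] = 0
        elif line.strip() and current is not None:
            added, deleted, _path = line.split("\t", 2)
            # git reports binary files as "-"; count them as zero lines.
            if added != "-" and deleted != "-":
                sizes[current] += int(added) + int(deleted)
    return sizes


def bucket(size):
    """Return the label of the first bucket whose upper bound fits the size."""
    for upper, label in BUCKETS:
        if size <= upper:
            return label
    return "huge"


def distribution(sizes):
    """Count how many commits fall into each bucket."""
    return Counter(bucket(size) for size in sizes.values())
```

Feeding the parsed sizes into distribution gives a per-bucket count that can be charted over time or compared across teams. Whether generated files, lockfiles, or merge commits should count toward the total is a judgment call that this sketch does not make.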

Cosmic Analogy

Think of your codebase as a star. Small commits are like the subtle, frequent solar flares that constantly dance around the corona. They are usually gentle, predictable, and easy to manage. These micro-flares keep the development process healthy and sustainable, with each change small enough to review, test, and understand.

Large commits, on the other hand, are like coronal mass ejections (CMEs). They're occasional massive bursts that expand outward, touching everything in their path. While CMEs happen naturally in our solar system, too many of them in your codebase can spoil your day. Just as astronomers monitor CME activity to predict space weather, you should track large commits to spot risky deployment windows and teams that might need process improvements.

Commit Size Distribution Visualization

How to Fix

Guidelines & Standards

  • Set team guidelines for maximum commit size (e.g., 200-400 lines for most changes).
  • Encourage atomic commits. I believe each commit should represent one logical change that compiles and passes tests.
  • Separate refactoring commits from feature commits to keep both small and focused.

Workflow Techniques

  • Break large features into smaller, reviewable chunks using feature flags or branch-by-abstraction.
  • Use stacked diffs or dependent PRs to show incremental progress without blocking work.

Automation & Enforcement

  • Establish pre-commit hooks or CI checks that warn when commits exceed size thresholds.
  • Track commit size metrics in team dashboards to create visibility and accountability.
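
A size check like the one described above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the 400-line threshold is an assumption taken from the upper end of the guideline range, and MAX_LINES, changed_lines, check, and main are hypothetical names. It only warns and never fails the build.

```python
import subprocess
import sys

# Assumed threshold: the upper end of the 200-400 line guideline.
MAX_LINES = 400


def changed_lines(numstat_text):
    """Sum added + deleted lines from `git ... --numstat` output."""
    total = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t", 2)
        # Binary files appear as "-\t-\tpath" and are skipped here.
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            total += int(parts[0]) + int(parts[1])
    return total


def check(numstat_text, max_lines=MAX_LINES):
    """Return a warning string if the change is too large, else None."""
    total = changed_lines(numstat_text)
    if total > max_lines:
        return f"warning: commit changes {total} lines (threshold {max_lines})"
    return None


def main():
    # Inspect the most recent commit; an empty --format= hides the header.
    result = subprocess.run(
        ["git", "show", "--numstat", "--format=", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    message = check(result.stdout)
    if message:
        print(message, file=sys.stderr)
    return 0  # warn only; return non-zero here to block instead

# In a real CI or hook script, finish with: sys.exit(main())
```

For a pre-commit hook, the staged change can be measured with git diff --cached --numstat instead of git show. Warning rather than failing leaves room for the legitimate large commits discussed earlier, such as formatting-only changes.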

Culture & Feedback

  • Review outliers with developers to understand whether a large commit was unavoidable or points to a planning issue.
  • Celebrate and recognize developers who consistently maintain small, well-structured commits.