developer blog

Refactoring Git Branches

TL;DR

I describe a technique that allows one to extract commits from a larger Git branch into separate branches. This “git branch refactoring” provides numerous benefits:

  • A fast-track for integrating urgent changes (like refactorings or bug fixes) that were created as part of feature development into the main Git branch, before the feature is merged.

  • Improvements to the quality, efficiency, and fun factor of code reviews, by allowing your team to review and merge the changes within a larger feature branch individually, by different reviewers, at different times.

Feature branches contain many different changes

When classes or methods do too many things, we refactor them by extracting individual pieces of functionality into separate classes or methods. The same applies to feature branches in Git. Ideally, a Git feature branch should also perform just one particular change, be it a bug fix, a refactoring, or a new feature.

We all know, however, that while adding the next cool feature to a code base, we come across existing code that needs to be improved along the way. So, in the spirit of Continuous Improvement and Kaizen, or simply because we depend on improvements to happen before we can continue developing our feature, we

  • fix bugs in existing code that we depend on,
  • add missing functionality to existing code that our feature requires,
  • refactor existing code, for example by extracting pieces of it to make them available separately,
  • do good in numerous other ways, for example by cleaning up messes, reducing smells, paying down technical debt, or improving code quality and spec coverage.

As a result of this process, many feature branches end up looking similar to the one depicted here (time flows from the bottom up here, i.e. new commits are added at the top):

bloated feature branch

Our feature branch, called kg_feature_1, is cut from the development branch (we follow the terminology of the NVIE model here), which is the main branch shared by all developers. So far, the development branch contains only the framework. Our feature branch contains a number of commits for the feature itself (“feature 1” to “feature 4”), a bug fix for an external code module (“bug fix 1”), and a more generic refactoring that was done in two separate steps (“refactoring 1a” and “refactoring 1b”).

Changes should be reviewed individually

We don’t want to just send this many-headed monster of a feature branch off as a single code review. It contains too many different things! This makes it hard for the reviewer to get an overview of what we are doing here. It’s also pretty hard to have meaningful conversations about several different things at the same time. This is just a recipe for confusion, cognitive overload, and missed details. It also reduces the fun and the power of code reviews.

We also want to review and merge the bug fix and the refactorings into the development branch right away, so that other developers can incorporate them into their work before it diverges too much from our changes. If we waited until the whole feature is done, it would be too late, and we would have to deal with a lot more merge conflicts than necessary.

These different types of changes in different parts of the code base should probably also be reviewed by different people. The bug fix gets reviewed by the author/maintainer of the affected module, the refactoring by the architect or tech lead, and the feature by another team member.

Let’s refactor our Git branch!

Extracting commits into dedicated branches

The change that most urgently needs to get merged into development here is the refactoring, because it might touch code that other people are currently working on. Let’s extract it into a separate branch.

# Cut a new branch for the refactoring off the development branch.
$ git checkout -b kg_refactoring development

# Copy the refactoring commits onto the "kg_refactoring" branch.
# (cherry-pick copies them; the originals stay on the feature branch for now.)
$ git cherry-pick [SHA1 of the "refactoring 1a" commit]
$ git cherry-pick [SHA1 of the "refactoring 1b" commit]

Our kg_refactoring branch now looks like this.

refactoring branch

It contains only the two refactoring commits. Perfect! Now we can push this branch to the Git repo, get it reviewed by the architect or tech lead, and merge it into the development branch.
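To try this without touching a real project, here is a minimal sandbox sketch. It builds a throwaway repository with a simplified version of the history above, looks up the two refactoring commits by their subject lines, and cherry-picks them. The file names, the `commit` helper, and the lookup via `git log --grep` are assumptions made for this illustration; in a real repository you would more likely read the SHA1s off the `git log` listing.

```shell
set -eu

# Throwaway sandbox repository; all file names and contents are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b development
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one new file per commit keeps the picks conflict-free.
commit() {
  echo "$1" > "$2"
  git add "$2"
  git commit -q -m "$1"
}

commit "framework" framework.txt
git checkout -q -b kg_feature_1
commit "feature 1" feature_1.txt
commit "refactoring 1a" refactoring_a.txt
commit "bug fix 1" bug_fix.txt
commit "feature 2" feature_2.txt
commit "refactoring 1b" refactoring_b.txt

# List everything the feature branch adds on top of development, oldest first.
git log --reverse --format='%h %s' development..kg_feature_1

# Look up the two commits we want to extract by their subject lines.
sha_1a=$(git log -1 --format=%H --grep='refactoring 1a' kg_feature_1)
sha_1b=$(git log -1 --format=%H --grep='refactoring 1b' kg_feature_1)

# Same steps as in the article: new branch off development, then cherry-pick.
git checkout -q -b kg_refactoring development
git cherry-pick "$sha_1a" >/dev/null
git cherry-pick "$sha_1b" >/dev/null

# The new branch adds exactly the two refactoring commits.
extracted=$(git log --reverse --format=%s development..kg_refactoring)
echo "$extracted"
# prints:
# refactoring 1a
# refactoring 1b
```

Because every commit in this sandbox touches its own file, the cherry-picks apply cleanly. In a real branch, a refactoring commit that builds on earlier feature commits can produce conflicts that have to be resolved by hand.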

Once this is done, we rebase our feature branch against the development branch to pick up all the changes that happened in it recently, like our extracted refactoring. We do this regularly anyway, since it is good practice to keep our feature branches synchronized with ongoing development and to resolve merge conflicts as soon as they happen, rather than all at once when the branch is done.

$ git checkout kg_feature_1
$ git rebase development

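If you know Git, you know what happens: the rebase skips commits whose changes are already present in the development branch, so the two refactoring commits disappear from our feature branch, and our branch situation now looks like this.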

after refactoring merge

This looks much better already. The refactorings are now part of the development branch, and our feature branch still has access to them, and uses them!
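The sketch below replays this step in a throwaway repository to show the effect. It assumes the extracted branch is merged into development unchanged (here via a fast-forward merge); if the commits were altered during review or squashed, their patches no longer match the originals, and the rebase may stop with conflicts instead of silently skipping them. File names and the `commit` helper are made up for this illustration.

```shell
set -eu

# Throwaway sandbox repository; all file names and contents are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b development
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one new file per commit.
commit() {
  echo "$1" > "$2"
  git add "$2"
  git commit -q -m "$1"
}

commit "framework" framework.txt
git checkout -q -b kg_feature_1
commit "feature 1" feature_1.txt
commit "refactoring 1a" refactoring_a.txt
commit "feature 2" feature_2.txt

# Extract the refactoring and merge it into development, unchanged.
sha_1a=$(git log -1 --format=%H --grep='refactoring 1a' kg_feature_1)
git checkout -q -b kg_refactoring development
git cherry-pick "$sha_1a" >/dev/null
git checkout -q development
git merge -q --ff-only kg_refactoring

# Before rebasing: "git cherry" marks with "-" the feature-branch commits
# that already have an equivalent patch in development.
git cherry -v development kg_feature_1
already_upstream=$(git cherry development kg_feature_1 | grep -c '^-')

# Rebase the feature branch; the already-merged refactoring is skipped.
git checkout -q kg_feature_1
git rebase -q development

remaining=$(git log --reverse --format=%s development..kg_feature_1)
echo "$remaining"
# prints:
# feature 1
# feature 2
```

Running `git cherry` before the rebase is a cheap way to preview which commits Git considers already merged.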

The only foreign body left in the feature branch is the bug fix. Once we have extracted it using the exact same technique, our branch situation looks like this.

refactoring branch

Both bug fix and refactorings have been individually reviewed and merged into the development branch, separately from the ongoing feature development. The refactored feature branch contains only feature-related commits, and can be reviewed now, or developed further. If we do more bug fixes or refactorings, we can repeat this procedure as needed.

When to do this

Extracting autonomous changes into their own Git feature branches can significantly improve the structure of your branches, the performance of your code reviews, and the efficiency of your workflows. To work well, however, it depends on a number of things:

  • Each commit is well described and addresses a single change (feature, bug fix, refactoring, etc).

  • You can rebase your feature branches. Only do that with branches that only you use! Read Linus Torvalds’ explanation if you are unsure what this means. In our case we marked our branches as private by prefixing their names with our initials, and according to our team policy this makes them private.

  • You have enough specs, especially feature specs (aka integration tests), to verify that everything still works after your branch refactorings.
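In addition to running the specs, a cheap sanity check is to compare the code on the feature branch before and after the rebase. The sketch below does this in a throwaway repository by keeping the old tip under a backup branch and comparing the two tree hashes. It assumes development gained nothing except the extracted commits in the meantime; otherwise the trees legitimately differ. The branch name `kg_feature_1_backup`, the file names, and the `commit` helper are hypothetical.

```shell
set -eu

# Throwaway sandbox repository; all file names and contents are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b development
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one new file per commit.
commit() {
  echo "$1" > "$2"
  git add "$2"
  git commit -q -m "$1"
}

commit "framework" framework.txt
git checkout -q -b kg_feature_1
commit "feature 1" feature_1.txt
commit "refactoring 1a" refactoring_a.txt
commit "feature 2" feature_2.txt

# Extract the refactoring and merge it into development, unchanged.
sha_1a=$(git log -1 --format=%H --grep='refactoring 1a' kg_feature_1)
git checkout -q -b kg_refactoring development
git cherry-pick "$sha_1a" >/dev/null
git checkout -q development
git merge -q --ff-only kg_refactoring

# Keep the pre-rebase tip reachable under a backup branch, then rebase.
git branch kg_feature_1_backup kg_feature_1
git checkout -q kg_feature_1
git rebase -q development

# Same tree hash means the files on the branch are byte-for-byte identical.
before=$(git rev-parse 'kg_feature_1_backup^{tree}')
after=$(git rev-parse 'kg_feature_1^{tree}')
if [ "$before" = "$after" ]; then
  result="identical"
else
  result="differs"
fi
echo "$result"
# prints: identical
```

If the result is unexpected, the backup branch still points at the old history, so the feature branch can be reset to it and the extraction retried.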

All of these are good engineering practices to live by anyway, so give this a shot, and let me know what you think in the comments!
