Managing Git Branches of Branches
Rebasing can get tricky!
git rebase --onto is worth knowing about, especially if you have long-running feature branches and use GitHub’s “squash and merge” feature.
Short story longer…
The problem revolves around keeping Git branches up to date or in sync with main/master, or dev, or whatever you consider to be your trunk. The two basic tools Git has for syncing up branches are
git merge and
git rebase. The other big contributor to this problem is a trend towards using “squash merges”.
GitHub pull requests (PRs) often contain multiple commits, but one of GitHub’s settings is to squash these into one commit at the point of merging to the trunk.
If your PR has conflicts, touches the same files, or just has some interplay with new commits that landed in the trunk since you opened it, you may want or need to sync up with the trunk before merging.
If you use
git merge [trunk] on a branch, it often makes later rebases really painful, and repeated merges from your trunk tend to become a problem for long-running branches, which is prone to happen with bigger features or epics.
The problem with
git rebase [trunk] on a long-lived branch is that it requires extra communication with collaborators and, especially, that it causes headaches for branches created off of that branch.
There are two basic solutions:
- Don’t use squash merges to mainline branches. This makes it easier for Git to determine common ancestors in the commit tree. Or:
- Get really good at rebasing and comfortable with force pushes to open feature branches, and use this instead of git merge to get feature branches synced up.
Solution 1 has the drawback of a much less clean and concise linear history in mainline branches. Branches of branches can still be really difficult to untangle if there are conflicts, but this approach generally avoids pseudo conflicts caused by rewritten histories. Solution 2 has the drawback that rebase is just kinda hard, and modifying commit history on branches that multiple people are collaborating on requires extra communication, so it can be a bit error-prone.
If you’re going with solution #1, you can stop reading here. That isn’t to say it is the better solution; it’s just not what I am trying to address. Our team is sticking with squash merges for now, and we seem to have more and more long-lived branches as our team grows.
The problem and solution workflow is a bit simpler to demonstrate with simple feature branches, but the rebase idea applies very similarly to epic branches and the features that would merge into those instead of “dev” or whatever you name your mainline trunk.
git checkout -b feature_branch
At this point the feature branch is up for review, but other features that depend on
feature_branch are in the queue. So while
feature_branch is in review, a new branch can be created from it so work can go forward.
git checkout feature_branch
Meanwhile, feedback or issues have been found on
feature_branch and new commits need to be made in response. No problem, right?
git checkout feature_branch
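To make the situation so far concrete, here is a minimal sketch that builds it in a throwaway repository. The commit labels (O, A, B, C, D), the file names, and the `commit` helper are hypothetical and invented for illustration; only the branch names come from the walkthrough.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real project.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev      # name the initial branch "dev"
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write one file and commit it.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt "base" "O"

# The feature goes up for review.
git checkout -q -b feature_branch
commit feature.txt "a" "A"
commit feature.txt "b" "B"

# Dependent work starts from the in-review branch.
git checkout -q -b dependent_branch
commit dependent.txt "d" "D"

# Review feedback lands on feature_branch afterwards.
git checkout -q feature_branch
commit feature.txt "c" "C"

ahead=$(git rev-list --count feature_branch..dependent_branch)
behind=$(git rev-list --count dependent_branch..feature_branch)
echo "dependent_branch is $ahead ahead of and $behind behind feature_branch"
```

Nothing is broken at this stage: both branches still share real commits, so Git can relate them without trouble.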
So far so good. Smooth sailing. However, choppy seas are coming. Once
feature_branch is ready to merge to a main line a problem starts to emerge.
NOTE: For context, if the squash merge feature were a manual process, it would look something like this.
git checkout dev
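A minimal sketch of that manual process, run in a throwaway repository so it is safe to try. This is an assumption about what the manual equivalent looks like, not a description of what GitHub runs internally. The commit labels, file names, and the `commit` helper are made up for illustration.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev      # name the initial branch "dev"
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write one file and commit it.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt "base" "O"

git checkout -q -b feature_branch
commit feature.txt "a" "A"
commit feature.txt "b" "B"
commit feature.txt "c" "C"

# The manual equivalent of "squash and merge":
git checkout -q dev
git merge -q --squash feature_branch      # stage the combined changes, no commit yet
git commit -q -m "feature_branch (squashed)"

dev_commits=$(git rev-list --count dev)   # O plus the single squashed commit
if git merge-base --is-ancestor feature_branch dev; then
  feature_is_ancestor=yes
else
  feature_is_ancestor=no                  # dev does not contain A, B, or C themselves
fi
echo "commits on dev: $dev_commits, feature_branch is an ancestor of dev: $feature_is_ancestor"
```

The second result is the important one: dev now has the feature's content, but none of the feature's commits.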
Built on shifting sand?
Even now the problem may not be obvious. On the trunk branch (
dev) everything is fine. A nice and tidy commit history even! But what about
dependent_branch? Its history has a bunch of commits (from feature_branch) that already exist as a squashed commit on dev. How do we get those now redundant (and, as far as Git can tell, conflicting) change sets out of the way? If we try a simple rebase of
dependent_branch onto dev, or try to merge dev into it, Git will be confused by the rewritten/squashed history and will treat essentially all of those changes in
dependent_branch as conflicts. It’s a tedious and error-prone process to go through and redo or de-conflict all of those changes.
A workaround that can sometimes be fine
Git branches are cheap, so one simple brute-force workaround is just to abandon
dependent_branch. Manually create a patch and apply it on a fresh branch created from the up-to-date mainline branch:
git checkout dependent_branch
Or, the same idea but using
merge --squash instead of a manual patch:
git checkout dev
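Both variants of the workaround can be sketched as follows, again in a throwaway repository. The exact commands the workaround uses are an assumption here, and the names `dependent_v2`, `dependent_v3`, the commit labels, and the `commit` helper are hypothetical.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write one file and commit it.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt "base" "O"
git checkout -q -b feature_branch
commit feature.txt "a" "A"
commit feature.txt "b" "B"
commit feature.txt "c" "C"
git checkout -q -b dependent_branch
commit dependent.txt "d" "D"
commit dependent.txt "e" "E"
commit dependent.txt "f" "F"

# feature_branch gets squash merged into dev.
git checkout -q dev
git merge -q --squash feature_branch
git commit -q -m "O2"

# Variant 1: carry the dependent work over as a patch.
patch_file=$(mktemp)
git diff feature_branch dependent_branch > "$patch_file"
git checkout -q -b dependent_v2 dev
git apply --index "$patch_file"
git commit -q -m "dependent work (from patch)"

# Variant 2: the same idea with merge --squash.
git checkout -q -b dependent_v3 dev
git merge -q --squash dependent_branch
git commit -q -m "dependent work (squashed)"

# Both end up with the same files as dependent_branch, but as a single commit.
v2_commits=$(git rev-list --count dev..dependent_v2)
v3_commits=$(git rev-list --count dev..dependent_v3)
if git diff --quiet dependent_branch dependent_v2 &&
   git diff --quiet dependent_branch dependent_v3; then
  same_content=yes
else
  same_content=no
fi
echo "v2: $v2_commits commit, v3: $v3_commits commit, same content: $same_content"
```

In this toy case everything applies cleanly because the squashed commit matches the tip of feature_branch exactly. The three original commits D, E, and F collapse into one on either new branch.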
But not without some pitfalls of its own
The main problems with these approaches are that, if there are any actual conflicts, they can be even worse to fix than with the rebase or merge approaches, and on top of that your Git history on
dependent_branch is basically lost.
Of course, this whole problem happens because of the rewritten history during the squash of a
feature_branch to a mainline branch. But anyway, what’s the best solution or workaround? If your instinct tells you that “Git is full of magic tricks. Surely there is something to be done?” your instinct would be correct!
Clarifying where the conflicts arise
First, I think it really helps to have a solid understanding of why this happens:
Let O be the “original” mainline branch: dev, master, main, whatever its name.
O2 will be the “updated original”, for example dev after a feature branch has been squash merged into it:
Say feature_branch looks like:
dependent_feature has a few extra commits on top of that:
GitHub merges your
feature_branch into dev, squashing it down in the process and, in our configuration, also pruning it (in the remote repository), giving you:
O - O2
If dependent_feature were on the remote repo and had a PR opened to merge to dev, it would look something like this:
O - O2
GitHub would flag it as conflicting, and maybe it’s easier to see why at this point. Of course, that old
rebase command is tempting and seems like it would be a solution, but there is a catch. If you were to try to rebase onto
dev, Git is going to try to figure out the common ancestor between those branches. While it originally would have been
C, if you had not squashed the commits down, Git instead finds
O as the common ancestor. As a result, Git tries to replay the feature_branch commits up through
C, which are already contained in
O2, and you’re going to get a bunch of conflicts.
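The reasoning above can be checked in a throwaway repository. This is a sketch under assumptions: three commits A, B, C on feature_branch and three more, D, E, F, on dependent_feature. Those labels, the file names, and the `commit` helper are hypothetical.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write one file and commit it.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt "base" "O"
git checkout -q -b feature_branch
commit feature.txt "a" "A"
commit feature.txt "b" "B"
commit feature.txt "c" "C"
git checkout -q -b dependent_feature
commit dependent.txt "d" "D"
commit dependent.txt "e" "E"
commit dependent.txt "f" "F"

# Squash merge feature_branch into dev, producing O2.
git checkout -q dev
git merge -q --squash feature_branch
git commit -q -m "O2"

# The common ancestor is O, not C, because O2 has no parent link to C.
base_subject=$(git log -1 --format=%s "$(git merge-base dev dependent_feature)")

# So a plain rebase would replay all six commits, not just D, E, F.
to_replay=$(git rev-list --count dev..dependent_feature)

# Try it: replaying A on top of O2 collides with the squashed changes.
if git rebase dev dependent_feature >/dev/null 2>&1; then
  plain_rebase=clean
else
  plain_rebase=conflict
  git rebase --abort
fi
echo "merge base: $base_subject, commits to replay: $to_replay, plain rebase: $plain_rebase"
```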
For this reason, you can’t rely on a simple unadorned rebase command. Fortunately, Git does have something to help. We’re still going to use rebase, but we have to be more explicit about how we want the rebase to proceed by supplying the --onto option:
git rebase --onto dev HEAD~3
O - O2
Adjust the HEAD~3 parameter as necessary for your branches, and you shouldn’t have to deal with any redundant conflict resolution.
A slightly better (for me) solution
An alternate syntax, if you don’t like specifying ranges: given that you probably haven’t deleted your unsquashed local copy of
feature_branch yet, you can do:
git rebase --onto dev feature_branch dependent_feature
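Here is a minimal end-to-end sketch of that command in a throwaway repository. It assumes dependent_feature has exactly three commits (D, E, F) on top of feature_branch, which is why HEAD~3 and feature_branch name the same starting point here. The commit labels, file names, and the `commit` helper are hypothetical.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write one file and commit it.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt "base" "O"
git checkout -q -b feature_branch
commit feature.txt "a" "A"
commit feature.txt "b" "B"
commit feature.txt "c" "C"
git checkout -q -b dependent_feature
commit dependent.txt "d" "D"
commit dependent.txt "e" "E"
commit dependent.txt "f" "F"

# Squash merge feature_branch into dev, producing O2.
git checkout -q dev
git merge -q --squash feature_branch
git commit -q -m "O2"

# In this layout, dependent_feature~3 and feature_branch are the same commit,
# so either spelling marks where the replay should start.
if [ "$(git rev-parse dependent_feature~3)" = "$(git rev-parse feature_branch)" ]; then
  same_start=yes
else
  same_start=no
fi

# Replay only the commits after feature_branch onto dev.
git rebase -q --onto dev feature_branch dependent_feature

replayed=$(git log --reverse --format=%s dev..dependent_feature | paste -sd ' ' -)
new_base=$(git log -1 --format=%s dependent_feature~3)
echo "replayed: $replayed, now sitting on: $new_base"
```

Naming the branch instead of counting commits has one practical advantage: there is no number to get wrong if the dependent branch has grown since you last looked.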
Standard rebase issues apply
Chances are that will go smoothly, but if not, conflicts should be fairly manageable in the sense they will be actual conflicts, not pseudo conflicts that Git just thinks it needs your help with. And, of course, if you had previously pushed
dependent_feature to a remote repository, you will need to force-push the new history up there, which is standard procedure after a rebase.
git push -f origin dependent_feature
You might also need to tell others who use that branch that they need to get the new version of it and that a standard
git pull won’t do. They need to use
git reset --hard origin/dependent_feature
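The whole exchange can be sketched with a local bare repository standing in for the remote. The "alice" and "bob" clones, the commit labels, and the use of an amend in place of a real rebase are all assumptions made to keep the example short.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/dev

# "alice" rewrites history; "bob" is the collaborator who has to catch up.
git clone -q origin.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/dev
git config user.name "Alice"
git config user.email "alice@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "O"
git push -q origin dev
git checkout -q -b dependent_feature
echo "d" > dependent.txt
git add dependent.txt
git commit -q -m "D"
git push -q origin dependent_feature
cd ..

git clone -q origin.git bob
cd bob
git checkout -q -b dependent_feature origin/dependent_feature
cd ..

# Alice rewrites the branch (an amend stands in for the rebase) and force-pushes.
cd alice
git commit -q --amend -m "D (rewritten)"
git push -q -f origin dependent_feature
alice_tip=$(git rev-parse HEAD)
cd ..

# Bob fetches first so origin/dependent_feature is current, then resets to it.
cd bob
git fetch -q origin
git reset -q --hard origin/dependent_feature
bob_tip=$(git rev-parse HEAD)
echo "alice: $alice_tip"
echo "bob:   $bob_tip"
```

Two details are worth noting. The reset only picks up the rewritten branch after a git fetch, and it discards any local commits or uncommitted changes on that branch, so collaborators should save their own work first. Also, git push --force-with-lease is a more cautious alternative to -f: it refuses to overwrite the remote branch if it no longer matches what you last fetched.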