Git tips and tricks

Last modified : 14 July, 2017

Git rebase Gotcha

Or how I suffered due to my ignorance about git-rebase.

Pop Quiz! Imagine the following scenario. You have a fork of Hadoop called branch-2.7.2-ravi, which was branched off of Apache rel/release-2.7.2:

                      F -> G -> H (raviprak/branch-2.7.2-ravi)
                     /
A -> B -> C -> D -> E (apache/rel/release-2.7.2)
            \
             I -> J -> K (apache/branch-2)

Time comes around that the world starts asking, “When will your branch have the latest, the greatest, the awesomest version of Hadoop? Version 2.7.3?” You grumble and mumble and then to save time on individually cherry-picking commits F, G and H (and about 54 others) onto a new branch-2.7.3-ravi (branched from apache/rel/release-2.7.3), you figure, WTH, I’ll just use git-rebase. This magical looking command. Oh thank God for small mercies. Yes sireeeee. Let’s use this git-rebase.

git checkout -b branch-2.7.3-ravi rel/release-2.7.3
git rebase branch-2.7.2-ravi

You work through the inevitable conflicts as git rebase does Its Thing (tm). Eventually you finish fixing the conflicts and voila! You have a new history:

                                                                                 (raviprak/branch-2.7.3-ravi)
                                                                               /
A -> B -> C -> I -> J -> K -> D -> E -> F -> G -> H.......-> P -> Q -> R -> S -> T (apache/rel/release-2.7.3)

You push the new release out thinking all is well with the world, and life is full of roses.

A few months later you get a bug that was fixed already. What? :-O WTH??? This is YARN-4354. This is supposed to be in branch-2.7.3-ravi. WHAT GIVES?

That’s when you realize, git-rebase is a deal from the devil. Oh the agony! The pain! How can I even explain?

Remember the new history git-rebase generated for you? Here’s an excerpt from the man page

git-rebase - Reapply commits on top of another base tip
…..
If the upstream branch already contains a change you have made (e.g., because you mailed a patch which was applied upstream), then that commit will be skipped.
…..
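What does "already contains a change" mean in practice? As far as I understand it, git compares the patches themselves (through patch IDs), not the commit hashes. Here's a minimal sketch in a throwaway repository. The branch names upstream-side and my-side, and the file names, are made up for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git tag base

# The same change, committed independently on two branches.
git checkout -q -b upstream-side base
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix, as applied upstream"

git checkout -q -b my-side base
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix, as I made it"

# git patch-id prints "<patch-id> <commit-id>"; keep only the patch id.
id_upstream=$(git show upstream-side | git patch-id | cut -d' ' -f1)
id_mine=$(git show my-side | git patch-id | cut -d' ' -f1)

echo "upstream-side: commit $(git rev-parse --short upstream-side), patch id $id_upstream"
echo "my-side:       commit $(git rev-parse --short my-side), patch id $id_mine"
```

The two commits have different hashes but the same patch ID, and that is the kind of match that makes rebase treat a commit as already applied.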

So if

commit C was "Commit YARN-4354",
commit Q was "Revert YARN-4354"
commit R was "Commit YARN-4354"

What happens? Why, R gets skipped, because the same change (C) was already in!! The revert Q still gets replayed, though, so the fix ends up reverted.
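Here's a small reproduction in a throwaway repository. It's a sketch under some assumptions: the branch names old and new and the file fix.txt are invented stand-ins, and the fix reaches the two branches as separate commits with identical patches (as happens with cherry-picks between release branches) rather than through a shared ancestor. Before rebasing, git cherry marks with a "-" the commits it considers already present upstream.

```shell
#!/bin/sh
set -e

# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > file.txt
git add file.txt
git commit -q -m "A: base"
git tag base

# "old" stands in for the branch we rebase onto; it already has the fix (C).
git checkout -q -b old base
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "C: Commit YARN-4354"

# "new" stands in for the release: same fix, then a revert, then the fix again.
git checkout -q -b new base
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "P: Commit YARN-4354 (same patch as C)"
git revert --no-edit HEAD >/dev/null
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "R: Commit YARN-4354"

# Preview: "-" marks commits on new that have an equivalent patch on old.
git cherry -v old new
skipped=$(git cherry old new | grep -c '^-' || true)
echo "commits rebase will treat as already applied: $skipped"

# The rebase from the story: replay new on top of old.
git rebase old >/dev/null 2>&1

if [ -e fix.txt ]; then result="fix present"; else result="fix missing"; fi
echo "after rebase: $result"
git log --format=%s
```

Both copies of the fix on new are dropped as duplicates of C, only the revert is replayed, and the rebased branch ends up without the fix even though its last commit before the rebase re-applied it. Running git cherry before a big rebase is a cheap way to see which commits are about to vanish.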

Git rebase subtlety

Here’s something I realized today. Let’s say you have the following branches

A -> B -> C -> D : master
               \-> E -> F -> G : branch-A
                             \-> H -> I -> J : branch-B

Let’s say you open a PR for merging branch-A into master. You get some reviews and you modify G to M (git commit --amend). The history for each branch now looks like:

A -> B -> C -> D : master
A -> B -> C -> D -> E -> F -> M : branch-A 
A -> B -> C -> D -> E -> F -> G -> H -> I -> J : branch-B

Now branch-A gets merged into master (Here N is the merge commit)

A -> B -> C -> D -> E -> F -> M -> N : master 
A -> B -> C -> D -> E -> F -> G -> H -> I -> J : branch-B

Now let’s say I want to merge branch-B into master too. What I typically do is rebase branch-B on master. But at this point there may be a conflict when git tries to apply G on top of master. If you resolve the conflict by making exactly the same changes that turned G into M during the PR for branch-A, the replayed G has nothing left to commit and branch-B’s history gets rewritten without it. This is great because now the merge into master will just add the O, P and Q commits (the rebased H, I and J).

A -> B -> C -> D -> E -> F -> M -> N : master 
A -> B -> C -> D -> E -> F -> M -> N -> O -> P -> Q : branch-B

However, if your resolution during the rebase differs by even a single character, you get:

A -> B -> C -> D -> E -> F -> M -> N : master 
A -> B -> C -> D -> E -> F -> M -> N -> S -> T -> U -> V  : branch-B

Here S is the difference between G and M, and T = H, U = I, V = J.
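Both outcomes can be reproduced with a small script. This is a sketch, not a recipe: the helper rebase_branch_b and the one-line file contents are invented for illustration, and it assumes a reasonably recent Git (2.26 or later, where the default rebase backend quietly drops a commit whose conflict resolution leaves nothing to commit).

```shell
#!/bin/sh
set -e

# rebase_branch_b RESOLUTION (hypothetical helper): build the history above in
# a throwaway repo, rebase branch-B onto master, resolve the conflict on G by
# writing RESOLUTION into the file, and print how many commits branch-B has on
# top of master afterwards.
rebase_branch_b() {
    resolution=$1
    repo=$(mktemp -d)
    (
        cd "$repo"
        git init -q
        git symbolic-ref HEAD refs/heads/master   # name the initial branch
        git config user.email "demo@example.com"
        git config user.name "Demo"

        echo "D" > file.txt
        git add file.txt
        git commit -q -m "D"

        git checkout -q -b branch-A
        echo "G" > file.txt
        git commit -q -a -m "G"

        git checkout -q -b branch-B
        for name in H I J; do
            echo "$name" > "$name.txt"
            git add "$name.txt"
            git commit -q -m "$name"
        done

        # Review feedback: amend G into M, then merge branch-A (merge commit N).
        git checkout -q branch-A
        echo "M" > file.txt
        git commit -q -a --amend -m "M"
        git checkout -q master
        git merge -q --no-ff -m "N" branch-A

        # Rebase branch-B onto master; replaying G conflicts with M.
        git checkout -q branch-B
        git rebase master >/dev/null 2>&1 || true
        echo "$resolution" > file.txt
        git add file.txt
        GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

        git rev-list --count master..branch-B
    )
}

identical=$(rebase_branch_b "M")    # resolve exactly as M did
different=$(rebase_branch_b "M!")   # resolve with one extra character
echo "identical resolution: $identical commits on top of master"
echo "different resolution: $different commits on top of master"
```

With the identical resolution, branch-B carries only the three rebased commits (O, P, Q). With the one-character difference it carries four (S, T, U, V), the extra one being the leftover difference between G and M.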

All content on this website is licensed as Creative Commons-Attribution-ShareAlike 4.0 License. Opinions expressed are solely my own.