GIT-op 2: Rebase

October 15th, 2009
by squidgit

Rebase is generally used to move a set of changes started against one base to another. 99% of the time, you’re developing something against an upstream branch, that branch moves and you want to get the latest and greatest. You can either merge upstream into your branch and lose your clean history, or rebase. Rebase has grown into a fairly general way to change history, though: you can use it to reorder, change and drop commits, and to move your development not just further up the same branch but across to somewhere else entirely.

Before I get going, please have a read of the last entry which has some tips about working with rebase; the biggest of these is simply Keep Rebased Trees Private.

GIT tree storage

A lot of rebase semantics make more sense if you realize that a GIT commit ID doesn’t really refer to a set of changes but rather to the state of your source tree immediately after that commit.  Many people expect

git diff abcdef1234

to give a unified diff of the changes introduced by that commit, but that commit ID, as above, represents a tree rather than a change.  What you actually get is the difference between that tree and your current working tree.  To see what the commit itself introduced, compare it with its parent (git diff abcdef1234^ abcdef1234) or use git show abcdef1234.

Branch names really just represent a specific commit ID, the ID of the most recent commit on that branch.  A branch name has no internal concept of where it is based, which leads to one of the most common errors when rebasing: telling GIT that the branch to operate on is based on the wrong commit.
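To make the snapshot idea concrete, here is a minimal sketch in a throwaway repository. The repository location, file name and identity settings are made up for illustration; only the git commands themselves matter.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # force the branch name used in this post
git config user.name "Example" && git config user.email "example@example.com"

echo one > file.txt && git add file.txt && git commit -q -m "first"
echo two >> file.txt && git commit -q -a -m "second"

# A branch name resolves to the ID of the most recent commit on it.
branch_id=$(git rev-parse master)
head_id=$(git rev-parse HEAD)

# Diffing against a single commit ID compares that snapshot with the
# working tree; nothing has changed since the last commit, so it is empty.
tree_diff=$(git diff "$head_id")

# To see what the commit introduced, compare it with its parent.
commit_diff=$(git diff "$head_id^" "$head_id")
echo "$commit_diff"
```

The last command prints a unified diff that adds the line "two", which is what most people were hoping to get from the single-argument form.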

Rebase

As I mentioned above, the most common use of rebase is simply to shift your commits further up an upstream branch.  If you’re following my advice about tracking upstream in the “master” branch and doing work somewhere else (call it “my_branch”) then this is accomplished by

git rebase master my_branch

or just

git rebase master

if you’ve got my_branch checked out.  Because GIT only deals with commits, you have to commit (or stash) your current work before you run it.

But remember branch names are really just commit IDs, so this is read by GIT as “take all the commits in my_branch which aren’t in master and replay them on top of the current tip of master”; or else “squish all the master history in before any my_branch history”.
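Here is a small sketch of that basic case, run in a scratch repository. The file names and commit messages are invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # force the branch name used in this post
git config user.name "Example" && git config user.email "example@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"

# Start my_branch and commit some work on it.
git checkout -q -b my_branch
echo work > work.txt && git add work.txt && git commit -q -m "my work"

# Meanwhile upstream moves on.
git checkout -q master
echo upstream > upstream.txt && git add upstream.txt && git commit -q -m "upstream change"

# Replay the commits in my_branch that are not in master on top of master.
git rebase -q master my_branch

# my_branch now forks from the tip of master.
new_base=$(git merge-base master my_branch)
master_tip=$(git rev-parse master)
```

After the rebase, my_branch contains both upstream.txt and work.txt, and its history is a straight line on top of master.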

Rebase interactive

This was described in the previous entry.  By appending the --interactive flag to the rebase (but otherwise using the same syntax) you get an opportunity to change the commits in my_branch as they get moved to the destination.  Note that the destination head can actually be exactly where my_branch is already based.  While this usually makes the rebase a no-op, in interactive mode you can use it to change commit history at any time.
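Interactive mode normally opens your editor, which is awkward to show in a post, so the sketch below scripts the edit instead. It assumes a reasonably recent GIT (one that understands the "drop" instruction) and default settings for the instruction list; the sed command simply stands in for a human changing the first "pick" to "drop". File names and messages are invented.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # force the branch name used in this post
git config user.name "Example" && git config user.email "example@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"

git checkout -q -b my_branch
echo a > a.txt && git add a.txt && git commit -q -m "add a"
echo b > b.txt && git add b.txt && git commit -q -m "add b"

# master is exactly where my_branch is already based, so a plain rebase
# would do nothing. In interactive mode the instruction list is still
# offered for editing; here the first line ("add a") becomes "drop".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/drop/'" git rebase -q -i master

remaining=$(git log --format=%s master..my_branch)
echo "$remaining"
```

Only "add b" is left on my_branch afterwards, even though the base never moved.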

onto

The --onto switch allows you to move a set of commits from one branch of your repo to somewhere completely different.  Say you’re developing feature_2 which is based on feature_1 but feature_1 is not yet in the upstream branch.  After a bit of a brainwave you realize feature_2 can be rearranged to not have that dependency and is ready to move upstream by itself.  Do

git rebase --onto master feature_1 feature_2

This is read by GIT as “take all the history from feature_2 which isn’t in feature_1 and stick it on the end of master”.  Because the --onto switch and its argument have to come before the other 2 branch names, the syntax seems backwards; just remember that the last 2 arguments mean the same thing no matter what version of the rebase command you’re using (the “base” and “head” branches respectively).
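A sketch of the feature_1 / feature_2 situation in a scratch repository follows. The file names are invented, and the two features touch different files so that feature_2 really can stand alone.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # force the branch name used in this post
git config user.name "Example" && git config user.email "example@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"

# feature_1 is based on master, feature_2 is based on feature_1.
git checkout -q -b feature_1
echo f1 > f1.txt && git add f1.txt && git commit -q -m "feature 1"
git checkout -q -b feature_2
echo f2 > f2.txt && git add f2.txt && git commit -q -m "feature 2"

# Take what is in feature_2 but not in feature_1 and put it on master.
git rebase -q --onto master feature_1 feature_2

moved=$(git log --format=%s master..feature_2)
echo "$moved"
```

feature_2 now carries only its own commit on top of master, and feature_1 is left untouched where it was.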

Another feature of --onto rebasing is an alternative way to drop commits from the middle of history.  Suppose you’re on branch “my_branch” and the last 5 commits have names “commit1” to “commit5” (I’ll be writing an entry soon on the best ways to specify actual revisions).  Then

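git rebase --onto commit2 commit3 my_branch

will take the stretch of commits after commit3 up to the branch head (commit4 and commit5) and move them to be based at commit2, i.e. you’ve just dropped commit3 from your tree.  The second argument is itself excluded from the set being moved, so it names the last commit you want to leave behind; passing commit4 there would drop both commit3 and commit4.  (Note the sloppy naming here: as I’ve been at pains to point out, commit names identify trees, not commits, so actually calling them commitN isn’t ideal :-) )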
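To check which commits survive, here is a sketch that builds five commits and tags them commit1 to commit5 (the tags are just a convenient stand-in for real commit IDs). Note that the second argument names the last commit to leave out, so dropping only commit3 means passing commit3 there.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/my_branch   # work directly on my_branch
git config user.name "Example" && git config user.email "example@example.com"

# Five commits, each adding its own file, each tagged with its name.
for n in 1 2 3 4 5; do
  echo "$n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "commit$n"
  git tag "commit$n"
done

# Replay everything after commit3 (that is, commit4 and commit5) onto commit2.
git rebase -q --onto commit2 commit3 my_branch

git log --reverse --format=%s my_branch
```

The log lists commit1, commit2, commit4 and commit5. The old commits are still reachable through the tags, which is handy if you change your mind.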

When all goes wrong

During a rebase, you may of course get a number of conflicts between your code and the tree on to which you’re trying to move it.  All such conflicts will be marked with standard merge markers (“<<<<<<<” and “>>>>>>>”) which you can grep for, view with git diff or find by reading the error log.  Once you’ve resolved all of these sites, mark the files as fixed by

git add my-fixed-file.c

but instead of committing, run

git rebase --continue

If you’re stuck and want to get out of there, run

git rebase --abort

to undo all rebasing actions.  Finally, if a commit is causing conflicts because it’s no longer needed,

git rebase --skip

will skip that particular commit.  Note that this will lose that commit, so be careful!
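The whole conflict dance can be sketched in a scratch repository like this. The file contents are invented, and GIT_EDITOR=true is only there so that the example doesn’t stop to ask about the commit message.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # force the branch name used in this post
git config user.name "Example" && git config user.email "example@example.com"

echo "original" > my-fixed-file.c
git add my-fixed-file.c && git commit -q -m "base"

# Both branches change the same line, which guarantees a conflict.
git checkout -q -b my_branch
echo "mine" > my-fixed-file.c && git commit -q -a -m "my change"
git checkout -q master
echo "theirs" > my-fixed-file.c && git commit -q -a -m "upstream change"

# The rebase stops at the conflict and exits non-zero.
git rebase master my_branch >/dev/null 2>&1 || true

# The file now contains the standard merge markers.
markers=$(grep -c '^<<<<<<<' my-fixed-file.c)

# Resolve by hand (here: simply overwrite the file), then mark it fixed.
echo "mine and theirs" > my-fixed-file.c
git add my-fixed-file.c

# Carry on instead of committing.
GIT_EDITOR=true git rebase --continue >/dev/null
```

Swapping the last three commands for git rebase --abort would instead put my_branch back exactly where it started.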

Finally, git rebase is smart enough to recognize commits which introduce the same changes as each other but have different descriptions, and skip them.  That is, if a commit in the tree you’re rebasing has already been accepted in to the new base, git will automatically skip that patch.
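A sketch of that behaviour: upstream picks up one of my commits under a different description, and the rebase quietly leaves my copy out. Names and messages are invented; cherry-pick plus an amended message is just one way of simulating “accepted upstream”.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # force the branch name used in this post
git config user.name "Example" && git config user.email "example@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"

git checkout -q -b my_branch
echo fix > fix.txt && git add fix.txt && git commit -q -m "fix the bug"
echo more > more.txt && git add more.txt && git commit -q -m "more work"

# Upstream moves on and accepts the fix, reworded.
git checkout -q master
echo upstream > upstream.txt && git add upstream.txt && git commit -q -m "upstream change"
git cherry-pick my_branch~1 >/dev/null
git commit -q --amend -m "Fix bug (applied upstream)"

# The commit whose change is already in master is skipped.
git rebase -q master my_branch 2>/dev/null

left=$(git log --format=%s master..my_branch)
echo "$left"
```

Only "more work" remains on top of master; the fix itself is still present, but it now comes from the upstream commit.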

Cool, huh?!  That’s it for now, next time I’ll show you the fastest and coolest ways to specify the commits you care about.
