Code Reviews and Bad Habits

rbrown46 · on Feb 20, 2014

We've been using Opera Critic (https://github.com/jensl/critic) where I work for about half a year. We did trial runs with Gerrit, Barkeep and Phabricator before settling on it. My impression is that its almost completely unknown outside of Opera, which is surprising given the feature set, clean UI and active development. The site used to review branches for the Critic project is also a demo: https://critic-review.org

It supports the model of having many atomic, reviewable commits on a topic branch. Where it seems to exceed Kiln is in the rebase/fixup workflow. You can push fixup commits in response to issues left by a reviewer. A fixup commit will automatically mark an issue it addresses as resolved when you push based on the lines it changes (the fixup must also be reviewed). You can later rebase the branch to integrate the fixups. It's pretty clever, and the way fixups are shown lets you see at a glance how the branch has evolved.

As a reviewer, and as someone who often digs through five year old commits, I want feature branches to follow the most linear path possible. Approaches that are started and abandoned, and commits made on top of a branch in response to review feedback are noise. A well-crafted branch often has little to do with what happened during development.

jrockway · on Feb 20, 2014

I disagree with this; you make one patch for your reviewer to review. If you want to do two things, make two patches.

The history of how you made that patch should be irrelevant; if you care, your patch is too big. Split it up.

I will admit that this takes some getting used to. When I use git alone, I commit for every little thing I change. But that's not actually useful for other people to look at; what people really want are high-level patches that represent logical changes, not just physical edits of the source code.

I've also gotten myself into a lot of trouble by committing a lot of things and then splitting them up for review. This always takes a very long time.

So just focus on one thing at a time, send it for review, and make sure your coworkers believe that reviews are the most important thing they can be doing. Then you get the small commits you like, but also get a more readable history, and better feedback from reviewers.

js2 · on Feb 20, 2014

I couldn't disagree more.

First of all, Gerrit groups the related commits together in two ways:

1) As a list of dependencies: "This commit depends on; this commit is depended on by."

2) If you develop your feature on a topic branch (and you should), when you push the commits to gerrit you can (and should) specify the topic. The related changes are then grouped by topic. ("git push origin HEAD:refs/for/master/topic_name")

Second, the reason this workflow is superior is that it requires each commit to be individually reviewed (and hopefully verified), and what ends up getting merged is a clean topic branch, with no bad commits.

Contrast this to the other workflow, and I'll use GitHub's Pull Request as the example. Typically what's reviewed is not the individual commits on a topic branch, but rather just the diff between the tip of that branch and the merge base between it and where it's to be merged. In this workflow the individual commits are often not even looked at. So you can have this series of commits on a topic branch:

   1. Commit A
   2. Fixup to commit A
   3. Commit B
   4. Commit C
   5. Fixup to commit B

When reviewing that in GitHub, you'd typically just review the total diff introduced by all 5 commits, and if it looks good, you merge all 5. Also, typically the CI system builds just 5 or the result of 5 merged into its destination branch. So now your history has the two fixup commits in it that don't belong there.

With the Gerrit-style workflow the fixups never enter the main branch, since instead you'd have squashed them into their respective commits. You'd also hopefully configure your CI system to verify each commit individually.

Further, on a decently sized topic, I find it easier to review a series of smaller changes, vs a huge diff.

It should be noted that this style of reviewing individual diffs is inspired by how reviews are done over mailing lists (Linux, Git, etc).

That said, once all the commits on a topic have been reviewed and perfected, it is often nice to be able to see the final diff of what is to be merged. With Gerrit you'd have to do that externally (via gitweb or by downloading the topic branch and diffing it locally against its destination).

Have this as a feature in Gerrit proper has been discussed and is somewhat contentious among its developers - https://groups.google.com/forum/#!topic/repo-discuss/FBJl5wW...

npsimons · on Feb 20, 2014

Yes, precisely; I got the eery feeling that TFA was yet another rant against rebase, and had it more or less confirmed by the end.

The question ultimately becomes, what do you care about more: recording every little change that's ever made, so you have a record of every time someone sneezed, or do you want a more coherent narrative that can easily be bisected? To my mind, half the reason to have code reviews and continuous integration is to make sure that the project history is coherent and bisectable.

Some might argue that non-pristine patches shouldn't be pushed in the first place, but let's be honest: we're all human, we all make mistakes, and this is (again) why we have code reviews and CI.

I can understand having options, and I'll admit that Gerrit seems good enough to me, mostly because that's all I've used. But I'll counter that successive patches are recorded in the history of Gerrit, along with commentary and the back and forth. I can see the pros to having smaller isolated commits, but one best practice recommends writing the tests first (and having them fail), then write the feature; in this workflow, you would not want to push failing tests separately because it would break bisect, and the CI won't let it through anyway.

koko775 · on Feb 20, 2014

> The whole point of a source control system is to tell you the history of how code got to be the way it was.

No, the point of source control is to record incremental improvements in a codebase. This is a subtle difference, but critical: the history of how codebase got from point A to point B isn't valuable in itself - the logical changes that altered the behavior of the project are.

A series of broken commits might show the developer's journey from zero to 100%, but the idea that each commit is a non-broken incremental addition enables powerful automated workflows for regression testing.

willismichael · on Feb 20, 2014

For what it's worth, Review Board has a way to visualize multiple revisions in a single code review request. I think it is primarily intended for the patch writer and reviewers to go back-and-forth (like "ok, here's my fix based on you comments"), but with some scripting you could make your own tool that uploads a whole series of revisions to Review Board.

chipx86 · on Feb 20, 2014

One of our contributors has recently done exactly this, triggered by a `git push`:

Announcement: https://groups.google.com/d/msg/reviewboard/SVOF-4KZ-VU/lSf0...

Video: http://vimeo.com/86448355

Also, for the record, after we wrap up this 2.0 release, we'll be focusing on allowing users to review branches of commits in a review request, and not just individual squashed patches.

scrabble · on Feb 20, 2014

I wrote a tool we use internally that has that as a feature for Review Board. It was super easy.

I still prefer a more formal code review process since things slip through from people hitting "Ship It" when it's not fully baked.

kdazzle · on Feb 20, 2014

Same with Phabricator

hahahafail · on Feb 20, 2014

100% agree. It's especially hard for teams that are learning both Git and Gerrit at the same time. I both administer and provide support for the Gerrit instance where I work and it has become a huge exercise in babysitting permissions configuration on the server side and dealing with angry users. They get themselves in to lots of "stuck" situations due to a lack of understanding, and it is now possibly building in to a revolt. For the record, I didn't pick Gerrit! :-) Fun times.

jaimebuelta · on Feb 20, 2014

I'm pretty surprised to read this, I've used Rietveld (Gerrit is a fork from Rietveld) and it is entirely possible to review more than one commit at the time (with the option to review them as a unique patch or as individual steps)

But what I love is the option of squashing the commits (or perform any other operation), as the review is independent from the repo, meaning that I can push a clean commit instead of adding a lot of commit fixing issues, which gets a dirty commit tree.

That can be considered as "not showing the full code history", but I prefer to think about it as "only pushing to the repo code that sort of makes sense instead of lots of confusing ver small commits fixing typos"

krallja · on Feb 20, 2014

https://code.google.com/p/gerrit/issues/detail?id=51 seems to imply that gerrit still does not know how to review more than one commit at a time

voidlogic · on Feb 20, 2014

What are the advantage to using Gerret, Kiln, etc for code review? Is it for people who using hg/git/etc without a web interface?

The way I normally do them is you submit your pull request on github or bitbucket and everyone on the team makes suggestions and asks questions until they are satisfied.

Once they are happy they comment "LGTM". Once you have a LGTM from your team members, you pull your pull request.

aiiane · on Feb 20, 2014

Side by side diffs, automatic integration, transactional commenting (queue up a bunch of comments, then publish them with a single notification to the patch writer), enforcement of LGTMs, better visualization of the evolution of a patch, better tracking of resolved comments, etc.

voidlogic · on Feb 20, 2014

Thanks for the info, I'm glad someone responded. Bitbucket has side by side diffs, but I think both those and inline difs work fine (just a matter of taste).

I don't think those other features (for me) justify the complexity of running another application. For example, not enforcing LGTMs is a feature, in am emergency, a team member can get stuff done. But if someone abuses that regularly, they loose pull rights (or are fired).

npsimons · on Feb 20, 2014

One way we use Gerrit is to hook the CI (Jenkins) up to it and make sure that the repo as patched passes a number of tests (static analyses, compilation with full warnings, unit and regression tests, build on multiple platforms, etc). If it doesn't pass the CI tests, Gerrit won't allow it to be merged. We let the machine do the dirty work, so we can focus on the semantics of a change.

mapleoin · on Feb 20, 2014

tl;dr Gerrit only allows you to comment on one commit at a time, not a set of commits.

michaelochurch · on Feb 20, 2014

This sounds to me like a case of a people problem being blamed on tools.

krallja · on Feb 20, 2014

Close, but it's a potential people problem being caused by tools.