webchick.net

very helpful lioness

Diaries of a Core Maintainer #5: The 6-pass patch review

Mon, 11/10/2008 - 13:47 -- webchick

I would love to see a series of blog posts from Drupal developers on patch review strategies they employ, so we can share some tips and tricks and ramp up our collective review IQs. I'll start it off with mine. I call it "The 6-Pass Patch Review" (wittier names welcome! ;))

To begin, crack open your text editor. You'll use it to jot down notes and questions as you go through each of these passes. Other than that, no fancy tools are required to do most of this other than a keen set of eyes, an inquisitive mind, and the ability to empathize with people new to Drupal and put yourself in their shoes. (Of course, if you are relatively new to Drupal, then patch review can actually come easier for you than other people. :) Start today!)

Pass 1: The "Coder module" pass

In this pass, you basically just become a human version of Coder module. Quickly scan through of the code to see if you can spot anything amiss coding standards-wise. The more patches you read, the more obviously things stick out to you that deviate from the rest of the code. For instance, after 3+ years of doing this every day, my brain is now physically wired so that if I see something outdented two spaces more than it ought to, or a missing period on an inline comment, I break out into a cold sweat. ;)

Probably at some point most of this pass will be completely automated from testing.drupal.org. But for now, it's a really great way to scan over the length of the patch and quickly get the sense of what's there without busting your brain too badly.

Pass 2: The "Bottom up" pass

In this pass, you literally start reading the patch hunks from the bottom to the top and see if they make sense to you. Try to resist the urge to read any comments (that's for the next step).

This is a great way to catch things like some_function(NULL, TRUE, array(), $booger, 'aardvark'). Without benefit of having seen the PHPDoc for some_function() ahead of time, if the calling code makes no sense, we have some work to do. This work could involve renaming some_function() to something that makes it more obvious what it's for, or it could involve refactoring some_function() into several functions that each take parameters specific to their functionality, or it could just involve re-formatting the way options are passed in. At the very least, it means we need some inline comments to explain what it does. Which leads us to...

Pass 3: The "Comments-only" pass

Now that you've read the code from bottom to top, you should hopefully be starting to glean some idea of what this patch is actually doing. Now, your task is to make sure the comments reflect your understanding, and/or clarify places where you are still head-scratching.

Beyond obvious things like checking for grammar and punctuation, your goal should be to see if you only read the comments and not any of the code, can you still grasp what the patch is intended to do? (to the best of your current understanding)

Pass 4: The "Actually read the issue" pass

Yes, that's right. This is pass 4. Why is this not pass 0, you ask? Because future Drupal users won't have the luxury of knowing all the various background discussions that led up to a decision, without breaking into some CVS annotate action. Having this background information before you go into a patch review gives you an unfair advantage. Remember that you can never un-read an issue, so make sure that you take copious notes of all the things you observed before doing this step. Once you've read it, your future reviews become poisoned by "insider" knowledge, and you can no longer share the same empathy for new developers.

During this pass, your goal is to figure out if all of the various twists and turns the discussion made are reflected in the code and its comments. Was there a tweaky bit of code that was explained well in one of the issue follow-ups? That explanation should be an inline comment above the code. Was there a great use case someone thought up for why this functionality would be useful? That should be reflected in the automated tests that come with the patch.

Pass 5: The "Does it actually work?" pass

Now, and only now, apply the patch, and verify that it does what it says it's supposed to do. Also try and think of other, sneaky things that might be related to this functionality change, or different ways that the change here might ripple out to other things in the system. For example, if the patch changes something about the user registration process, you might also think to test the login process, since they are related. And, if you have true Testing-Fu, you might also think to test OpenID, since it touches both, and bugs within it are not likely to be encountered in most peoples' day-to-day use of Drupal.

At this stage, you should also run all the related tests and make sure that they pass. While doing so, check for holes in the testing suite that either this patch should include tests for, or that should be filed as separate issues. The tweakier the change, or the more things it touches, the more we definitely need tests to confirm that it stays working.

Pass 6: The "Put it all together" pass

Now, with all that knowledge in your head, go back through the patch one more time, this time in the proper order, and jot down any other notes that come up. By this time your text file is probably 60MB and you have plenty of follow-up questions and discovered flaws to mark that sucker code needs work. ;)

So that is mine. Interested to hear what other people do, too!

Comments

This is fantastic -- I've been meaning to start doing code reviews, but have been a bit intimidated and keep putting it off. This guide will really help me get my feet wet. Thanks!!!

Submitted by webchick on

Listen, I *totally* understand being intimidated by code reviews. There's always this perception that people who participate in open source generally, and the Drupal core queue particularly, are these l33t super geniuses whose code is so wonderful it cooks you an eight-course meal so that you can then eat off of it. ;)

In reality, though, everyone makes goofy mistakes in their patches, or *completely* over-thinks things, and can benefit from a fresh set of eyes. Patch review is also probably the fastest way to ratchet up the "community karma ladder," not to mention learning a ton.

Whenever you (or any "you" reading this) are ready for your first review, pop into #drupal. There are normally an army of us in there who'd love to help. :)

Submitted by chx (not verified) on

My first is skimming the patch, usually problems stick out like a sore thumb anyways. And then I go over it more throughly and try to think up other ways to do the same and comparing which one is better. I usually give only a quick look to the issue, hunting for fixed space things -- benchmarks, code ideas of how to do stuff. Only big API changes merit a through issue reading.

Submitted by Damien Tournoud (not verified) on

I tend to act on two types of issues:

  • the bug issues: in those, I try to confirm the issue, understand its root causes (CVS annotate can be useful in pointing fingers :p) and find the best way to fix those. Generally I'm jumping on board when there is no patch yet, or when the patch tend to fix the symptom and not the issue (we have a lot of those). In those case, I either roll a patch or suggest a direction.
  • the tasks/feature requests: for those I tend to act like chx, trying to make the best out of the patch so that it becomes worthy of the attention of Dries and/or webchick. Direct reviews on the IRC have proven very successful in that matter.

I've been looking for some consistent method of testing patches, and so far I've done it backward a lot of the time: Read the discussion, then see if the patch works, then examine the code.

Reading the issue first is indeed not a good idea - usually all the discussion intimidates me and stops me from even looking at the patch. So much philosophical debate and deep thought has gone into it; what could I possibly find to improve?

Your solution of examining the patch first fixes that - if I find something, I can comment without having to read all the previous discussion.

(Although I still get red ears if I ask "What kind of silly design is THAT?" only to look back and see that it was proposed by Dries or chx or some other Drupal deity. ;) )

Submitted by webchick on

This is just the method I employ, but any approach to patch review is perfectly valid. There's nothing at all "backward" about reading the issue first (in fact, likely another patch reviewer will stumble across this post and go "Dude, are you NUTS?" ;)). I'm just more of a "bottom-up" person, so this approach works well for me. :) Sounds like it might be a good fit for you too!

And remember that "Drupal deities" are definitely not infallible. ;) Ask chx how often I've smacked his patches down to code needs work for things like incorrect coding standard compliance or even glaring parse errors. ;) And then he can tell you about the horribly inefficient (and sometimes insecure) code that I usually write the first time. :D But it's totally cool, because we all work together to improve the code and make it something we can all be proud of. :)

Also, never underestimate your "non-diety" perspective -- it's extremely valuable. Most people who are using Drupal are not Dries and chx. If the code doesn't make sense to you, it's a problem, and we should re-work it until it does. :)

Submitted by chx (not verified) on

Hey, I am just writing a few patches:) No, seriously, I am just a human with rather human traits like erring a lot.

Just because a patch comes from Dries or chx or merlin or any of a number of "high profile" folks doesn't mean it can't be absolute and total crap. :-) I've had to tear apart more than one of chx's patches for being both totally incomprehensible and wrong at the same time. And in hindsight, looking at, oh, the first 8 or so versions of DBTNG make me wonder why anyone lets me write code.

People who write good code do so because their peers didn't let them get away with writing crappy code. It's good to keep 'em on their toes. :-)

Great post. I do a lot of code review and usually have a lot of comments, but I don't follow any formal strategy. Here are some of the things I look for:

Before looking at the code, you may want to imagine how you would have done it (pass 0). If your idea is actually what is implemented, well, that's probably good. If it is not, and you think your idea is more obvious, the comments should explain why another route was chosen (pass 3). If you think your idea is less obvious but better, you should discuss this with the developer (pass 6).

When reviewing patches I use a mindset that the patch is bad, and that I have to prove that it contains bugs, performs badly, has security exploits, uses wrong coding style etc. If I fail to prove this, it's good. This sounds negative like a negative approach, but the point is that your judgments shouldn't be influences by whether you like the developer or not, or whether you are excited about the feature added by the patch.

Pass X: Check that Doxygen comments reflect what the function actually does, in particular that argument types and return types are correct. Any special cases should be mentioned (e.g. if a function returns FALSE under certain circumstances, what exactly are these conditons).

Pass Y: Check return types. Whenever some code calls a function foo(), make sure that the arguments are always of the types mentioned in the documentation of foo(), and the caller handles all possible return values from foo() (e.g. if a function usually returns an array but sometimes FALSE, make sure the caller handles both). Make sure the code doesn't blindly add if's to guard against when the return value occasionally is FALSE, and that it contains relevant notes about what circumstances causes foo() to return FALSE (if it isn't obvious) and actually does something relevant in these rare cases (and make sure all these circumstances are documented in the Doxygen, if they affect the return value of the function).

Pass Z: The helicopter: See if the patch follows the conventions used in the file/project with respect to variable names, code structure etc. rather than introducing its own. If a patch fixes one problem, make sure that it fixes all occurrences of this problem (including similar but not identical problems). If a patch introduces a new convention of doing something that isn't currently done in the project, make sure that the new convention is flexible enough to be used elsewhere too.

If you have a lot of comments and are reviewing a patch for a person for the first time, make sure you don't scare him. Tell him that it is quite normal that you have many comments, and explain him that the review process is a way to work together to improve the patch etc.

Submitted by webchick on

I actually learned quite a bit reading this. :)

Before looking at the code, you may want to imagine how you would have done it (pass 0).

I think this is a really smart thing to do for both seasoned developers *and* relative junior programmers. Both could be in a position to learn something if the way that ended up being done in the patch was extremely smart/clever, and both might be able to point out a much more obvious way to do something if it's not. :)

When reviewing patches I use a mindset that the patch is bad, and that I have to prove that it contains bugs, performs badly, has security exploits, uses wrong coding style etc. If I fail to prove this, it's good.

+100. This is an essential mindset to go into a core patch with. It should only be very begrudgingly, after *all* of the possible problems and edge cases are covered by the patch that you are *forced* to move it to RTBC because no more problems can be found. :D

However, it's very important that this be coupled with:

If you have a lot of comments and are reviewing a patch for a person for the first time, make sure you don't scare him. Tell him that it is quite normal that you have many comments, and explain him that the review process is a way to work together to improve the patch etc.

The last thing we want to do is scare new people off from contributing with 5 paragraphs of nit-picking. ;) So you're right; it's very important that we explain to people we know not to be "veterans" WHY we are doing this and help reassure them that none of these comments are about the individual; they're always about the code. And we make them so that the code is something we can all run on our production servers with the knowledge that they've been peer reviewed to the highest level of quality.

dang, and here I was, simply just doing step 5 and running the patch to see if it worked

that alone took me a whole lot of effort, being forced to work on a vista machine, it wasn't easy to find and configure some tool that could apply a .patch without making all these anti virus shiznitz go berserk on me

I actually manually copied and pasted .patch content for a good month :x

then again, I was never a good example of "best practise" ;)

thx for the tips, I'll try harder next time, even though it seems more overwhelming now :<

Submitted by webchick on

Again, I want to re-emphasize: this is the way I, webchick do patch reviews. Other people might do this completely different than this, and think I am crazy for doing it this way. ;)

I think when you're starting out, step 5 is a great place to jump in. Because just as people who can read and understand code might catch more things when they don't have the benefit of having read the issue, people who are really good at testing functionality might catch more things when they don't have the benefit of having read the issue OR the patch.

*Applying* patches on Windows, unfortunately, is indeed a huge pain in the butt. :( I'm not sure if Cygwin works under Vista, but that's how I used to do it back when I was on XP. If your web host lets you SSH in, you can also use your host as a patch testing platform, which is what some people do as well.

There's more information available at http://drupal.org/node/60179.