webchick.net

very helpful lioness

Diaries of a Core Maintainer #6: A tale of two developers

Sun, 01/25/2009 - 18:58 -- webchick

(Hey, neat! This was mentioned on ZDNet: Perfectionists need not apply.)

Most people in the free software community have probably read Eric S. Raymond's The Cathedral and the Bazaar, which compares and contrasts two free software development models: the Cathedral style (think, "perfectionism") and the Bazaar style (think, "chaos").

While Drupal itself is clearly an open, Bazaar-style project, many individuals in the Drupal community tend to take the perfectionist approach to development. After all, *thousands* of people will be using this code, and likely *hundreds* of developers carefully inspecting its inner-workings. How embarrassing would it be for another developer to stumble across a "no-brainer" bug in your code? Best to sit on things until you know it's really solid before putting it out there in front of people. Right?

Sure, that sounds logical. But in my experience, people who take this approach to development in an open source community, and especially the Drupal community, are at a severe disadvantage to those who embrace the chaos and put their changes out in front of everyone as they're going along, warts and all.

I'll attempt to illustrate this by way of a dramatization comparing the problem-solving approach of two hypothetical Drupal developers: Sloppy Sam, and Perfectionist Pat.

Sloppy Sam

Sloppy Sam is a hobbyist, who has been coding PHP off and on for a few months. One morning, she gets an idea for a new feature for Drupal core. She begins by searching the issue to see if something like it exists already and, finding it does not, posts a new issue with a general description of her idea. She then opens her IRC window and asks in #drupal, "Hey, folks! I had an idea for a new feature for core that [brief summary]. What do you think about this? [link]" Chatty Charlie from #drupal reads the issue and says, "Hey, that sounds pretty cool! You should talk to Fancy Fran about this, because I remember her talking recently about doing something similar."

Since Fran isn't online right now, Sam decides to whip out her editor and slap together some basic code that has the general jist of what she's trying to do, and posts it around lunch time to the issue, marking it "patch (code needs work)." She again asks in #drupal, "Ok, here's the general approach I'm thinking: [link] Does this seem pretty good?" This time, Fancy Fran is online, and replies, "No, actually. you're going to run into problems with caching. Here, I've pastebinned a copy of a module I wrote before that did something similar. This will probably help you."

Delighted, Sam takes the code from Fran and merges it into her own work, and posts another patch at supper time, this time upgrading the issue to "patch (code needs review)." Each time that Sam posts a new follow-up to the issue, it's bumped up to the top of the main issues listing. This causes Reviewer Rodney, who spends lots of time browsing this page, to stumble across the issue. He recognizes Sam's name from other issues he's seen, and since he knows her to be an active contributor, decides to spend some time thoroughly testing the patch and reviewing the code. Rodney leaves detailed comments about code standards violations, missing tests, points out a logic problem in one part of the code, and suggests some clarity to the documentation. He then marks the issue back to "patch (code needs work)."

Sam thanks Rodney for his comments, and makes the necessary adjustments just before having her evening tea. When finished, she sets the issue back to "patch (code needs review)." She then asks in #drupal for a "review swap," which both helps to get her patch reviewed *and* gets her more exposure to other parts of Drupal she might not otherwise have encountered. Chatty Charlie, who is already up to speed from having talked to Sam about her idea earlier, takes her up on it and gives the patch a quick test drive and makes a once-over through the code and then proclaims it "patch (reviewed & tested by the community)," just in time for bed. By the time she wakes up the next morning, it's been committed, and is now part of the latest release of Drupal.

Perfectionist Pat

Perfectionist Pat, a brilliant computer engineer with ten years' experience in software development, envisions a new feature for Drupal core. He begins to plan his improvements as he would any of his other projects: with a piece of paper and pencil, carefully analyzing the problem and perhaps drawing out some flowcharts and diagrams to help him to understand the existing code flow and how his improvements fit in.

He sees that there are a few different ways of approaching the feature, and consults a variety of reference material: books, RFCs, and specification documents. After a few days of evaluation, he is confident that the second of his three choices is the most optimal way to go.

The next day, Pat sits down at his IDE (which is naturally configured to automatically conform to Drupal's coding standards to the letter) and begins carefully crafting and documenting the various functions, hooks, and page callbacks he will need. As he goes through, he also writes automated tests so that he can ensure that every branch of logic is thoroughly tested.

Within another three days, he has the functionality nearly completed. Or so he thought! Upon manual testing, he comes across something he hadn't originally forseen: there is a problem when caching is enabled! Pat smiles smugly, proud of himself for catching this before it ever saw the light of day in the public eye. He re-evaluates his earlier analysis, and begins to refactor his solution to take this new information into account.

Finally, just ten days later, Pat has a patch that he feels is core worthy running on his computer. He takes it through the final paces of SimpleTest and Coder modules, just in case, and then posts a new issue to the issue queue, marked "patch (code needs review)." Now, he waits until someone comes across the issue to give it a review.

Because Pat's patches tend to be fewer and further between than Sam's (since his are always uploaded perfectly the first time), Reviewer Rodney doesn't recognize Pat's name when he clicks on the issue. And since Rodney's is a bit pressed for time today, he moves on to another patch instead.

Pat begins to grow frustrated, and comes into #drupal demanding to know why no one has taken a look at his perfect patch that he toiled over for more than a week. Chatty Charlie takes pity on Pat, and decides to click into the issue. He recognizes it as the very same issue that Sam was working on just yesterday, and since hers has already been committed, Charlie marks Pat's issue as a duplicate. Infuriated, Pat clicks into Sam's issue intending to tell her off and show her how a real programmer solves this problem, only to see that through the community review process, Sam came up with exactly the same patch, only in one tenth of the time, and with far more detailed PHPDoc comments, thanks to Rodney's review.

Observations

Obviously, these are two very extreme, made-up examples that paint both a rosy, best-case scenario and a tarnished, worst-case scenario to make a point. In reality, most people and most situations are somewhere in the middle.

Both approaches outlined here are a reasonable way of "getting things done" in the Drupal community, and both also have their time and place. There are definitely times when it makes a lot of sense to get out a whiteboard and start whipping up flowcharts before jumping into coding. And there is also a limit to the community's patience for how many times you bring them in to look at what is still sloppy, first-attempt code. And despite anyone's best efforts, there are definitely times when you can't get people to bite at the community review process, no matter how much you try, and you have to go it alone.

But, if we analyze Perfectionist Pat's approach, we see that it had some detrimental effects:

  1. Pat is out of touch with what the Drupal community is doing. Note how his first instinct is to sit down and solve the problem, not look to see what others in the community were doing. Had he posted an issue when he began to solve the problem, not only would he have "staked out" the fact that he was working on it, but Sam would've found the issue in her search days later, and could've worked with him rather than solving the same problem in parallel. This ended up wasting both of their time.
  2. Pat works in isolation. As a result, he is missing out on the biggest benefit of using an open source product: the thousands of other developers who understand the system innately and can help him short-cut his learning curve. By not working with the community, he also misses the opportunity to do things like "review swaps" that can help him get more eyes on his code. By not working in the open issue queue, he also misses out on the opportunity for more eyes to stumble across the work he's doing and offer guidance/reviews.
  3. Pat fails to earn "karma" in the community. Even though her initial few attempts ended up not being correct, because Sam had more "touch points" of positive interaction with the community, her name stands out more from the crowd. This has a lot of benefits, including people being more likely to find time in their day to help her out. In fact, Pat's frustration-laden comments in #drupal after he didn't get his patch reviewed probably earned him negative karma in the eyes of some, which will have the exact opposite effect.

One other interesting thing to note is that all of the "famous" Drupal developers that you've likely heard of take Sam's approach. And here you thought they were all so smart. ;)

Conclusion

Regardless if you're a Pat or a Sam, when working within the Drupal community, keep the following points in mind:

  1. Resist the urge to isolate yourself until "perfection" is achieved. It can be normal in a "day job" (as well as school projects) to lock yourself in a room for 3 days and come out with a shining example of code. But doing this in an open source community is missing out on what is arguably the biggest benefit of using open source -- other peoples' expertise with the system. You can still achieve the same perfection in your code you're used to; you can just do it much faster by talking to other people as you go along.
  2. Be open and transparent. Begin any problem solving session with a search of the queue, and a new issue if one doesn't exist. The sooner the Drupal community catches a glimpse of your work, the sooner someone can jump in and make suggestions. Post your progress to the issue as you go: interim patches, questions, summaries of IRC discussions, etc.
  3. Scratch other peoples' itches while they're scratching yours. When asking something from others, be prepared to give something in return. For example, "review swapping" or answering questions in #drupal-support, or clarifying documentation. This will not only help ramp up your "karma" within the community, but can help you gain experience in different parts of Drupal you wouldn't otherwise be exposed to.

Follow these guidelines, and you should have a much easier time with whatever you do in Drupal. :) And hope to see you in the issue queue, all you Pats out there! :D

Comments

Submitted by webchick on

And sorry, Perfectionist Pat. You know we love you. :)

I like this example a lot, and I think it definitely applies not only to Drupal but to a certain swath of free software projects (including stuff like debian, probably). There are some projects that are organized differently (both in terms of code modularity or developer tasks) or projects of different sizes, where the metaphor doesn't quite line up, but that's alright. I'm a big picture guy anyway, and it's a cool way to start thinking about how the bazaar works *on the ground*.

---

Does the structure of the review process, and the issue queue push people to be more, I don't know, "reserved" about submitting patches? Is the following thought process relevant or usual: "I need this to be reviewed and accepted by [big name] or [someone I respect] or [person with power] so it better be good."? Particularly in light of style guides, and their partisans? I'm not arguing against reviews or style guides at all, but rather toying around with perceived social effect of these mechanisms on submitters.

Drupal has a lot of semi-formal "procedure," in place to manage the community (given its size), and while procedure can be codified, and imposes values that are productive for the project as a whole (style guides, issue management tools, etc.) it raises the barrier to entry a bit.

At the same time, taking away procedure, and making it *all* chaotic (a la the way that I understand Linux Kernel development happens) doesn't solve these issues either as communities would tend to isolate themselves, and develop their own kinds of informal hierarchies that have many of the same issues.

----

And the final thought has less to do with core-space and more to do with contrib-space, where (it seems to me) many individual modules take a much more top-down (cathedral) approach to development and process. In part this is because successful bazaar projects need extant code in order to really work (eg. Linux from Minx; gnu emacs from gosling emacs), and with modules being pretty small, contrib-developers may feel like they need to get to a certain place before releasing the project. More mature module projects also seem to be relatively top-down because the teams that work on them are smaller and the amount of "play"/"chaos" at work between a dozen (or even two dozen) people is much different than the amount of entropy that exists among a group of tens of thousands of people.

---

Anyway, thanks for the essay.

Submitted by webchick on

"Does the structure of the review process, and the issue queue push people to be more, I don't know, "reserved" about submitting patches? Is the following thought process relevant or usual: "I need this to be reviewed and accepted by [big name] or [someone I respect] or [person with power] so it better be good."? Particularly in light of style guides, and their partisans?"

Yes, this definitely comes into play. Core patch reviewers and committers have a well-deserved reputation for being merciless when it comes to things like coding standards adherence and ensuring that the optimal solution for any given problem is reached. The prospect of marching your code past this Flaming Hall of Judgment, particularly when you're a relatively new contributor, is absolutely terrifying to people.

Contrib feels a lot safer, since it's your private sandbox and you can decide what standards you do and don't adhere to, what documentation you do and don't write, and unless you're a popular "oughta-be-in-core" project like Views or CCK, you can rest relatively re-assured that no one is ever going to inspect your code very closely, as long as the module's working as intended.

I think our challenge here is to raise awareness that:

  1. Reviews are always about the code, never about the coder. It can be very difficult to separate "me the developer" from "the code that I wrote" but once you do, you can really start to appreciate the value of a thorough patch review.
  2. Giving patch reviews and getting patch reviews from others are one of the fastest and most effective ways to raise not only your Drupal IQ, but also your general development IQ. I had 7 years of experience with PHP before I started with Drupal, but helping in the core queue caused me to actually feel like I "knew" the language. And I still pick up great new tricks every single day.
  3. When people nit-pick every last mis-placed apostrophe in your code, they are doing so because they care. And I don't mean that in a tongue-in-cheek way; Drupal core powers hundreds of thousands of websites, including enterprise-class web applications, schools and non-profits, businesses, and more. It's imperative that every line of code goes through intense scrutiny because ultimately millions of people are affected. And because Drupal development is done completely by volunteers, who may flit in and out of the community at a moment's notice, it's also very important that we keep a high level of documentation and coding standards quality, so that anyone could jump in and become an expert in a particular aspect of the system. Because inevitably, some day we will need someone to do that.

I'm very open to suggestions on how to reduce these barriers to people. I'm toying around with some things like a "quick fix" tag to highlight issues that are a bit easier to solve/review, and have started to centralize various core initiatives at http://drupal.org/community-initiatives/drupal-core to help show people looking to make a bigger splash in the community where to jump in. The Patch section of the handbook recently got a revamp to provide much more information to people who'd like to get started either contributing or reviewing patches. But obviously, we still can (and need to) do more. Any ideas you or anyone else has would be fantastic. :)

And the final thought has less to do with core-space and more to do with contrib-space, where (it seems to me) many individual modules take a much more top-down (cathedral) approach to development and process. In part this is because successful bazaar projects need extant code in order to really work (eg. Linux from Minx; gnu emacs from gosling emacs), and with modules being pretty small, contrib-developers may feel like they need to get to a certain place before releasing the project. More mature module projects also seem to be relatively top-down because the teams that work on them are smaller and the amount of "play"/"chaos" at work between a dozen (or even two dozen) people is much different than the amount of entropy that exists among a group of tens of thousands of people.

Yep, in a lot of ways, contrib is a different animal, since you aren't going to get dozens of people looking at every line of your code (under the best of circumstances you might get one or two), and in general it's one person (or a very small team of people who all know each other) working on a given module/theme.

However, you can still re-use the "Sloppy Sam" approach, even in contrib:

  1. Develop the project in drupal.org CVS, not in a private SVN repository or similar. In fact, you should try and make a rule that the only drupal.org CVS checkouts of modules can be put in your private SVN repository (apart from that token client_name.module that form_alter()s out some stuff here and there). This way, bugs don't get fixed for you unless they're fixed for everyone. And you never get into that awkward discussion about "Oh, yeah. Right. We meant to contribute that back, but..."
  2. Create public issues for all of the items on your todo list, and mark them fixed (with patches attached) as you finish them. Not only does this make your CVS annotate history much more helpful to future archaeologists, but it provides visibility into what you're thinking so that others could stumble across and lend advice, testing, and sometimes code. And my favourite benefit of doing this is that it forces you to think through the project in bite-sized chunks that have a clear description of what needs to happen.
  3. Talk about what you're working on in #drupal as you go through. You might attract the attention of someone who says "Oh, you don't actually need to do that because you can use the Blarg module" or you might come across someone who says "OMG! I need exactly that thing! I am around for testing this entire week!" or similar.

Thanks a lot for your thoughtful post!

An example of point 3 is the views_attach module. At Palantir, we realized we had a need for a certain module. We discussed what features it might need, how it should work, etc. Before writing it, though, I decided to ask in #Drupal-Views if anyone knew of something along those lines. Up popped none other than code machine eaton, who said he'd already written something very close but never made a real release out of it. We talked for another hour, I picked up his code, dusted it off, added a feature or two, and two hours later had a new module in CVS. It's apparently become rather popular already, and because it's in CVS for all to see and use there's a number of feature requests against it already, some with offers of patches to come that I'm looking forward to. :-)

Hi Angie,

Thanks a lot for pointing to the Quick Fix list, I wasn't aware it existed. This reminds me of your patchnewbie idea that apparently has died. Perhaps it's a good idea to reinstate the patchnewbie user/tag since there's a big difference between a Quick Fix and a Newbie Fix. If we want more people to post and review core patches then we need to break some barriers. The Quick Fix breaks the important time barrier and the Newbie Fix could break the learning curve barrier and give people a head start.

Submitted by webchick on

Patchnewbie was a horrible hack to get around the fact that we didn't have issue tagging. It required logging in as a separate user (and since only me and one or two other people knew the pass, it was very ad-hoc), assigning an issue to ones' self, then logging back in as your normal user to do your "regular" work, etc. Not fun, and not ultimately sustainable.

I was going to re-use "patchnewbie" as a tag once we got issue tagging, but "quick fix" is better because it's applicable to documentation issues, design tweaks, etc. There might be a difference between quick and newbie, but not much of one. Or at least, I've been using the tag the same as I was using the patchnewbie user.

I haven't announced it widely yet because I was waiting to talk to Charlie Gordon to see if we could use "quick fix" as a way of implementing the "DROP" program we tried to run after GHOP. Once I've had that discussion, I'll go ahead and pimp it everywhere. ;)

Submitted by Tony (skiquel) (not verified) on

You put a lot of work into your Core diaries. Just giving you a courtesy comment to say its thought provoking and helpful

Submitted by Arancaytar (not verified) on

I've been at both of these places and know the frustration of duplicating work. As well as the frustration of starting to write something and then simply having to give up because it's beyond me. I was partly into the Views PDO rewrite when I had to stop and admit this wasn't a one-person job. ;)

I see I really need to subscribe to your blog because as of now, I usually only hear about one of these brilliant posts weeks later when it's mentioned on #drupal or in an issue. :D

OK, this is *so* totally awesome, because there's 2 patches (features, rather) I've been waiting to see in Drupal and a really important contrib module. I'd been letting it off because I thought it might not get that much attention (yeah, I know this sounds kind of absurd) but I guess I'll get my pieces (as Sloppy Sam) together now. :-)

Highly inspirational. Cheers!

Submitted by Anonymous (not verified) on

Well, if I was a Perfectionist Pat (in fact I'm neither of them nor someone even in-between) I would feel a kind of offended and insulted after reading this article and would probably look for somewhere else to contribute where I would feel more welcomed...

I guess its only me....

Submitted by webchick on

That's good feedback, thanks. To anyone out there who makes it this far, I meant no disrespect to any Perfectionist Pats out there, and I really do apologize for anyone I might offend by this post. I am a Perfectionist Pat at heart, too (just ask anyone whose patches I've reviewed ;)), so the sarcasm in this post is really poking fun at myself, not you. :)

Back when I was a Google Summer of Code student, which is how I got started with Drupal, I would do the Perfectionist Pat thing: hole away in a corner, furiously typing in code, drawing up diagrams, stepping through debuggers, poring through documentation, etc. (Students are sort of used to this way of working, and we end up seeing this same pattern every GSoC with at least a few participants.)

After a few days of doing this, I still didn't really "get it" (I'll save a separate rant for another day about how poorly school prepares students for working on other peoples' code), and I was starting to panic about finishing my project on time. So finally, I found the courage to ping my mentor Robert Douglass and... ask some questions. Because, you know, I didn't want to appear "stupid" by asking questions (even to the person whose "job" it was to help me), so instead I burned through three days being totally frustrated and not accomplishing anything. ;) And he said, "Stop doing that." ;)

The short time line of GSoC basically forced me to completely change my approach to problem solving and adopt the Sloppy Sam style. And as soon as I started talking to people and throwing my work out for everyone to see as I was completing it, my Drupal life took a radical change for the better. I started making friends, I started learning all kinds of things at a much more rapid pace than otherwise would've been possible, and the "big picture" of Drupal came into focus within only a couple of weeks. I don't even want to think about how long it would've taken me had I kept with my "native" approach to problem solving.

So this post is just an attempt to save other fellow Perfectionist Pats out there the same trouble I had to deal with. :)

I'm a bit of a Shy Sheryl, so I've been nervous to expose my mediocre code to all the Awesome Angies on Drupal. But this post has inspired me. I will (gulp) post to an issue queue.

Thanks!

Submitted by webchick on

See you in the queue, "Joyful Jami"! ;)