Older blog entries for titus (starting at number 444)

Exhibiting aggressive competence

This last term I facilitated the participation of five MSU students in the Undergraduate Capstone Open Source Projects (UCOSP) program, in which students do distributed open source software development and receive home institution credit. UCOSP was managed out of U Toronto by Greg Wilson, and I was (and am) enthusiastic to participate as it's clearly a good way to bring open source into education.

However, I was less thrilled to see that the majority of the MSU students received, ahem, "less than passing" grades from their project leaders. I knew about the problems in one particular project from having met with the students on a regular basis, but the other results caught me by surprise. I would love to kick and scream and complain that I should have been made more aware of what was going on -- and where I had constructive things to suggest, I did -- but the more important failure may have been a mismatch between the MSU students' approach to these projects, and project expectations.

The students variously had a number of problems, ranging from team miscommunication & poor conduct to an inability to get the software to compile. This meant that for several students, no visible work got done -- for example, in one project, it regularly happened that person X was working on a patch, and person Y committed an overlapping patch first. Or on another project, person Z spent two months trying to get the basic project infrastructure compiled, and was reduced (at the very end) to submitting code fixes without testing them in the full project context. Or several times, person A spent a week working out how to refactor a test into something reliable, and the result was what looked like (and maybe was) a trivial code change.

All of these situations may result (and did result) in low evaluations. This is understandable: no visible work got done, so how is an evaluator supposed to grade them!? Yet, all of the situations are legitimate issues that block progress. What is a student to do?

The answer won't be too hard to guess for anyone who has worked on real-world team projects: make your struggles visible.

Someone steps on your patch? Fine -- submit your patch too, and explain why it's better (or worse) than the first patch. Code review the other patch, while you're at it: who better to do the review than someone who really understands the issues? Then when you get poor marks for not having contributed code, point at your patch. (You are using version control, right?)

Can't compile the software? Fine -- write down what's going wrong, and post it publicly. Document your fix attempts. Ask for help. Bash your head against the wall repeatedly. Either fix the problem, or document the problem thoroughly. Either someone will help you, or you'll figure it out, or you'll leave an audit trail so that others won't have to do all that fail work. Then when you get poor marks for not having contributed any patches, point out that the project has technical issues and either no one could help you (project FAIL) or you spent all your time fixing them.

Trying to debug niggling details that turn out (in the end) not to involve big impressive code changes? Submitting too many unimpressive patches that no one seems to value? Write down why your contributions are valuable. At the end of the day the evaluation may (rightly or wrongly) be "not too smart, but sure did work hard" -- but that's better than "no evidence of any work having been done".

Note how a lot of this seems to involve communication? Right -- that. For team projects, being an effective communicator is more important than being a kick-ass programmer.

At the end of the day, there are things you can control, and things you can't control. You can't control what other people think of you, and you can't control how other people (including project leaders and professors) evaluate you. But you can visibly work hard, and defend yourself based upon that evidence.

I call the general approach of throwing energy at a project "aggressive competence", and I think it's a necessary component of effective team software development. Everyone has days, or weeks, or even months where they look incompetent or ineffective; often that's because outsiders don't understand or appreciate the work that you've done. Tough on you, but I don't think it's reasonable to expect your boss, or colleagues, to look hard at your work to find reasons to praise you. Fundamentally, it's your responsibility to "manage up" and communicate your progress to others effectively.

This is where I think there were mismatched expectations. The students expected that they were going to be managed, helped, and given clear expectations. They weren't. So they got bad evaluations.

In open source projects (and elective college courses) the immediate ramifications of a poor evaluation may not be clear -- I'll leave you, dear reader, to figure out the longer term consequences. But I think the ramifications of a poor evaluation are immediately obvious in the context of a capstone course, or a paying job.

Incidentally, this illuminates one of the reasons why I'm such a big fan of UCOSP: it is reality. You're working on an existing project, with other developers, at a distance; and it's not anyone else's responsibility to frame the problem for you. It's your responsibility to make progress.

What do I plan to do? Well, assuming that UCOSP + MSU goes forward next term, I will be communicating my expectations quite clearly to the students. And I will be asking for regular progress reports, sent to me and CCed to the project leaders. And I'll be sending them this blog post. And I'll be failing the ones that don't listen.

I'll end with a paraphrase of one of my favorite sci-fi authors: "every new developer has problems on a new project. The extent of our sympathy for those problems, however, will be dictated by the efforts made to overcome them."

--titus

p.s. It's also a good way to figure out which projects you don't want to work on: I once got dinged for working too hard in a company; I was told that I was "rowing too fast and the boat was going around in circles." My response (that perhaps others might consider rowing faster) was not received well. That's the kind of job situation you can leave without guilt (as I did).

p.p.s. Code reviews can be an extraordinarily effective passive-aggressive way to correctively interact with jerks on a project, too.

Syndicated 2009-12-18 19:10:45 from Titus Brown

A Tale of a Bug

or, "those python-dev people are awesome."

My experience with the Python bug tracker has been pretty sparse and largely limited to some of the eternaissues like "make HTMLParser deal with even more broken HTML" that never really get resolved because they're not very important and don't have a champion. So when I filed this minor bug report on test_distutils I was not expecting it to be on anyone's top 10 list.

I filed the bug report at 17:21 and dropped a note to python-dev.

Within an hour or so, several people reported that they could not duplicate it, and one person had reported that they could, on the mailing list.

At 19:58 Ned Deily reported through the bug tracker that he could dupe it.

By 20:36, Tarek had a suggestion for something to try (that wasn't the problem, but never mind).

By 21:58, Ned had tracked it down to a UNIX-y flavah difference (BSD vs SysV) in the way group ownership was set on files.

And at 22:30, Tarek had fixed the problem (by dropping the assertions).

So, umm, wow, that was quick! Just over 5 hours, across at least two continents, on a Sunday...

---

I was impressed by how many people chipped in to get a broad spectrum view of things, how quickly some hypotheses were generated & in the end how quickly the issue was resolved -- for a relatively minor problem in one corner of the test suite that didn't show up on any of the buildbots.

I'd bet that part of this speedy response is because Tarek Ziadé has taken on stewardship of the distutils code. If so, this highlights how important it is to have people who feel responsible for various bits and bobs of the stdlib & just do whatever needs to be done.

Oh, and it also highlights the value of continuous integration across many machines; I saw the problem because I was running tests inside pony-build in a certain way on a Mac OS X 10.5 machine, and the error wasn't tripped in the 10.6 buildbot.
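For the curious, the BSD vs SysV difference is roughly this: on BSD-flavored systems (Mac OS X included) a newly created file takes the group of the directory it's created in, while on SysV-flavored systems it normally takes the effective group of the creating process. Here's a minimal sketch for seeing which one your machine does. It is not the distutils test itself, and the function name is made up for illustration.

```python
import os
import tempfile


def group_ownership_report():
    """Create a file in a fresh temp directory and report the group IDs involved.

    Hypothetical helper for illustration; not part of distutils or its tests.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "probe.txt")
        with open(path, "w") as fp:
            fp.write("probe\n")
        return {
            # Group the new file actually ended up with.
            "file_gid": os.stat(path).st_gid,
            # Group of the containing directory (what BSD semantics inherit).
            "dir_gid": os.stat(tmpdir).st_gid,
            # Effective group of this process (what SysV semantics use).
            "process_egid": os.getegid(),
        }


if __name__ == "__main__":
    for name, gid in sorted(group_ownership_report().items()):
        print(name, gid)
```

If `file_gid` matches `dir_gid` but not `process_egid`, you're looking at the BSD behavior; on many machines all three are simply the same number, which is presumably why most people couldn't reproduce the failure.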

--titus

Syndicated 2009-11-30 03:10:17 from Titus Brown

Comparing two Python source files for equivalence

We've been doing some wholesale PEP 8 reformatting over in the pygr project and we're trying to figure out how to review the changes to make sure nothing inadvertently got broken. Most of the changes are in spacing and line breaks, not in variable names, so I think what we really want is a whitespace-ignoring diff utility. That's not that difficult to find -- except that we're coding in Python, so relative whitespace indentation levels are significant!

Any suggestions for tools that could help?

I've tried several approaches based on tokenizing and using ASTs, but I'm not that clever. I'm going to take another look at processing two token streams, and if that doesn't work, I think 2to3 may hold the key...
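To make the AST idea concrete, here's a minimal sketch, assuming a modern Python 3 `ast` module rather than whatever we end up using for pygr. The name `same_ast` is made up for illustration. It parses both versions and compares the dumped trees, which carry no line or column information, so spacing and line breaks drop out while block structure (and therefore indentation level) still counts.

```python
import ast


def same_ast(source_a, source_b):
    """Return True if two source strings parse to identical ASTs.

    Hypothetical helper for illustration only.
    """
    tree_a = ast.parse(source_a)
    tree_b = ast.parse(source_b)
    # ast.dump() leaves out line/column attributes by default, so pure
    # layout changes do not affect the comparison.
    return ast.dump(tree_a) == ast.dump(tree_b)


if __name__ == "__main__":
    before = "def f(x,y):\n  return x+y\n"
    reformatted = "def f(x, y):\n    return x + y\n"
    broken = "def f(x, y):\n    return x - y\n"

    print(same_ast(before, reformatted))  # True: only spacing differs
    print(same_ast(before, broken))       # False: the operator changed

    # Indentation that changes block structure is caught, too.
    nested = "if a:\n    b = 1\n    c = 2\n"
    dedented = "if a:\n    b = 1\nc = 2\n"
    print(same_ast(nested, dedented))     # False
```

One caveat: comments never make it into the AST, so this says nothing about whether a comment got mangled; docstrings are string constants in the tree and are compared.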

--titus

Syndicated 2009-11-26 14:20:12 from Titus Brown

Lazyweb query: CloudStore (or KosmosFS)

Does anyone have any experience with CloudStore, formerly known as KosmosFS? From http://en.wikipedia.org/wiki/CloudStore:

CloudStore (KFS, previously Kosmosfs) is Kosmix's C++ implementation of
Google File System. ... CloudStore supports incremental scalability,
replication, checksumming for data integrity, client side fail-over and access
from C++, Java and Python.

The project site is here: http://kosmosfs.sourceforge.net/

I'd be interested in comments on general usability, quality, and "feel"...

thanks!

--titus

Syndicated 2009-11-25 04:14:30 from Titus Brown

Diversity in a Nutshell

Since a few people have asked, here's a rough guide to the diversity discussion. No specifics allowed.

1. diversity list created to (among other things) ponder an official diversity statement for Python. List is closed-archive but open for general subscription.

2. Various diversity list discussions become heated. Some people (including myself) leave list in response. Sigh of relief, back to normal life; is that a good response?

3. A few weeks pass.

4. Diversity list discussion hunts me down on psf-members and tries to pounce. Narrow escape.

5. Proposed diversity statement from diversity list posted to psf-members for discussion and hopefully? approval; diversity discussion engulfs psf-members list like a revenant whale.

6. 1000s of messages pass. Or at least many dozen. People agree, disagree, agree to disagree, disagree on their agreement, and otherwise cause trouble by collectively failing to accept any part of the proposed diversity statement. (Tho it's actually much more complicated than that.) Troubling and unprovable accusations of widespread anti-diverseness in the Python community are softly bowled across the lawn.

7. Diversity discussion from psf-members cross-posted to diversity list. Non-PSF members on diversity list freak out at the idea that the PSF might adopt a diversity statement that did not take into account some of the issues they had discussed. Hurt feelings ensue, including frustration by various people that other people are doing things they don't want them to do, in complete violation of expectations. Troubling and unprovable accusations of fairly specific anti-diverseness in the Python community are left, steaming gently, on the lawn. Closed nature of both lists engenders and amplifies confusion.

8. Still no diversity statement from the PSF on the horizon.

Things have quieted down for the evening.

Personally, it's been the most unpleasant set of interactions to watch and (occasionally) participate in that I've seen in the Python community in a long time; one can only hope that we reach some form of passionate agreement in the future:

Agreement in a group setting is truly a wonderful thing. But we should be wary of agreement that comes without any work, any disagreement, and disruption. We must never mistake quiet civility for passionate agreement.

(See this link for the whole post from which that quote is taken; Godwinning is unintentional but, frankly, a rather ironic endpoint to my meanderings.)

My new theory? It's all a plot instigated by the Perl community to distract the Python community so that Perl 7 can get the jump on Python 4k. It's the only way I can make sense of it all.

--titus

Comments closed, because I just don't care what anybody thinks any more.

Syndicated 2009-09-17 04:11:46 from Titus Brown

More GHOP -- conference call on Friday

As I wrote over the weekend, the Google Highly Open Participation contest (intended to get high-school students involved in open source work) may be run again this winter. I say "may", because quite a bit of work needs to be done on the GHOP hosting app, Melange.

We in the Python community are in a uniquely good position to help out with this: Melange is written in Python, on Google AppEngine, using Django. It would be great were a horde of Django experts to descend upon Melange to offer their help. Melange also could use some help testing; any testing experts out there who want to donate their time?

If you want to get involved, please attend the IRC meeting on the 18th of September at 18:00 UTC on #melange.

thanks!

--titus

Syndicated 2009-09-14 13:50:29 from Titus Brown

GHOP to run again; HELP.

The contest formerly known as GHOP is going to run again this fall, and we need your help.

Yes, you. YOU, over there in the corner. Stop avoiding this post!

GHOP, for those of you who don't remember or weren't around 2 years ago, was the very successful pilot sister program to the Google Summer of Code that involved students aged 13 and up from countries around the world (excepting only the Axis of Evil) in open source work. Nearly 400 students (!) participated and there was much rejoicing. (Summary post here, and all of my blog posts on Python's GHOP here.)

The good news was that GHOP was a big success from the perspective of many people: unlike the GSoC, which requires a substantial time investment from the mentor, and is only intended for coding work, GHOP involved byte-sized chunks of work in all areas (docs, testing, etc.) and rewarded both students and mentors for even a little bit of participation. In a signal of GHOP's success, by the end of the contest coming up with new Python-based tasks was easy -- people were literally throwing them at me, because they saw the rate at which existing tasks were being completed! I know that GvR was happy with the doc patches that made it into Python, and Andre Roberge gives GHOPpers a fair bit of credit for their contributions to Crunchy; there are a number of other success stories, too, including when Kumar told me that a task was too big and open-ended and then a 13 year old took the task and aced it, proving that I am not always wrong to ignore Kumar.

The bad news was that running GHOP was an immense amount of work, largely because of a lousy infrastructure -- Google Code isn't intended for this kind of thing, but we had to use something Google-hosted because it was a contest.

So what did Google do? They created the Melange project to help provide infrastructure for the GSoC and the GHOP both. It was used for the GSoC this last summer, and despite its rough edges, it worked out quite well.

Now Google is running GHOP again, and they're aiming to start December 7th. Unfortunately, in order to make that happen, they need a LOT of help on Melange.

Where do YOU come in?

Well, presumably you're a Python coder. You may be an expert in testing. You might be a Django nutcase. You're probably a Web developer (and odds are you've written your own Web framework, too, but never mind).

And guess what Melange is written in?

That's right, the best language on Earth (or at least a reasonable facsimile of it) -- Python.

You already know the language.

You already know how to use it in anger, to make the computer do your bidding.

Why not put your skillz to use?

I will be hitting up specific people and specific lists once we know when the IRC meeting to discuss Melange development is. Why not save yourself the aggravation of feeling guilt when you get my e-mail in a few days, and just sign up for the Melange dev list right now?

---

Seriously, GHOP was awesome last time and we got a lot done for quite a few different Python projects. This time, we're older, more experienced, and better prepared to take advantage of GHOP. Join us, and you will become more powerful than you can possibly imagine!

You can find a list of areas where Melange devs feel they need help right here. I look forward to seeing YOU working on them!

--titus

Syndicated 2009-09-12 03:42:08 from Titus Brown

How the Python stdlib changes (a public service message)

In the interests of social anthropology, I feel compelled to point Pythonistas at this fascinating discussion on the stdlib-sig on adding argparse to the Python stdlib. (Yeah, it's pretty much the only traffic that list got so far this month.)

Fascinating stuff. If there's a secret cabal out there masterminding Python development, they are clearly rather poorly organized ;)

--titus

Syndicated 2009-09-12 02:27:04 from Titus Brown

Buggy Python code?

I'm looking for examples of frustratingly simple-yet-wrong Python code, suitable for an undergrad class to debug. I'd prefer things that don't rely on tricky features of Python (like shared list references), but rather code where subtly bad logic or program flow leads to bad behavior.

Comment below, or e-mail me; I'll post the ones I pick later. thanks!

--titus

Syndicated 2009-09-09 02:13:57 from Titus Brown

Chickens are not a rate limiting factor

My wife and I were talking with my USDA collaborator about some possible chicken research, and I asked about access to animals. His response? "Chickens are not a rate limiting factor."

Did you know that 1 million chickens are slaughtered per hour, on average, in the US? Wow.

--titus

Syndicated 2009-09-06 23:59:31 from Titus Brown
