Older blog entries for apenwarr (starting at number 59)

Trust No One

adewhurst linked to Paul Graham's recent article about business and Open Source. Adrian's question essentially is: so is this a great article, or does it just sound really convincing?

The answer is, from just the context of the article, you don't know. The problem is that the author suffers badly from what's called selection bias: wanting to believe something, and then examining only the evidence that supports your belief.

I learned about the term "selection bias" by reading Harvard Business Review, a really excellent magazine, even if you think (like I do) that most businesspeople are stupid. Ironically, I can use HBR as a counterexample to Graham's point about "blog knowledge" being more interesting than stuff published by traditional companies. Meaning no offense to Paul Graham, nearly every article in every issue of HBR is better researched and more instructive than any of Graham's essays that I've read. Why? I don't know, but if you follow my link, you're going to have to pay to read those articles, and Mr. Graham gives away his knowledge for free.

Has anyone ever told you about selection bias before? If not, then like me, you're going to make lots of stupid decisions. Browsing the Internet didn't get me that information. Paying HBR some money did.

So, what specifically in this article suffers from selection bias? Nearly every statistic and every fact in the entire article. Not to say that he's wrong... but the proofs are all invalid.

For example: Open Source produces better software, e.g. Linux and Firefox. The evidence? People - maybe more than 50%, depending on who you ask - are "switching to them" from their proprietary counterparts. That's pretty convincing, right? Not really. First of all, they haven't switched yet. The proprietary software still has the vast majority of the market. Secondly, Linux is obviously inferior to Windows for certain types of uses, i.e. anything a small business might want. (There's no way a small business could figure out how to set up a Linux system to do, say, what Microsoft Exchange does out of the box.) And Firefox? Okay, maybe it's better, although I find it horrendously slow, and by the way, Microsoft hasn't released a new version of IE in something like three or four years. (Longer, if you consider the fact that nobody could really tell the difference between IE 4 and IE 6 just by looking.) Firefox has had a lot of time to work on becoming just barely comparable to IE. But, even though Microsoft just politely let them catch up for a few years (something I claim was their actual, smart business strategy!), Firefox is just now becoming barely better than IE. But yes, some people are switching to Firefox. Congratulations.

You can prove anything you want with statistics. Apache is the best web server, yes. Linux is the best OS platform, maybe. Firefox is the best web browser at the moment, quite possibly. But three examples don't prove anything; what's the best graphics package, word processor, music library, media player, game platform, game, or accounting package? The answer to almost all those questions isn't in Open Source. Now, by skipping some examples I didn't like, I just used selection bias on purpose to prove my point. Did you notice? And do you see why, after this whole rant, we still don't know if Paul Graham is right or not?
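To make the trick concrete, here is a minimal sketch in Python. Everything in it is synthetic: the 1000 "categories" are made up, and each winner is a seeded coin flip, so it says nothing about real software. The helper name `cherry_pick` is hypothetical. The point is only that the same data set hands you three tidy examples for either conclusion, while the full count tells a much duller story.

```python
import random

def cherry_pick(results, want, k=3):
    """Return up to k examples whose outcome matches the conclusion we want."""
    return [name for name, winner in results if winner == want][:k]

# Purely synthetic data: 1000 made-up software categories,
# each assigned a "winner" by a seeded coin flip.
rng = random.Random(2005)
results = [(f"category-{i}", rng.choice(["open", "proprietary"]))
           for i in range(1000)]

# Same data, two opposite "proofs", depending on which examples we show.
pro_open = cherry_pick(results, "open")
pro_proprietary = cherry_pick(results, "proprietary")

# The unselected picture: look at every category, not just the convenient ones.
share_open = sum(1 for _, winner in results if winner == "open") / len(results)

print("Examples 'proving' Open Source wins:", pro_open)
print("Examples 'proving' proprietary wins:", pro_proprietary)
print(f"Share won by Open Source across every category: {share_open:.0%}")
```

Neither list of three is a lie. Each one just leaves out the other 997 rows.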

If so, at least you've learned something. Hope it works out for you.

It's not about the music

It's about the solid-state twirly wheel. Oh, the carnal happiness...

A Beginner's Guide to Art Appreciation

I think the worst job in the world would be to be a full-time art critic. Cursed with the ability to understand what makes something great, but not the ability to make such great things yourself, you're forced to look at other people's inferior work and try to explain what's wrong with it. How depressing.

On the other hand, if you have no idea what you're doing - as in, for example, my case with most kinds of art - it's much easier to appreciate. "This person can do something I could never do." There's something admirable in that. It makes you feel a bit safer about the world, knowing that even if you can't solve all the problems, at least there's probably someone around who can fill in the blanks.

This is where I could put in a bit about innocence lost and the fact that happiness enabled by ignorance is always unsustainable, but that would kind of ruin my point, so I'll leave it out.

On Distributed Intelligence

Yesterday my friends and I were looking for the site of the Canada Day fireworks (see if you can find them by browsing the web site!). We were initially having some rather bad luck, but were reassured by the occasional people who would come up to us and ask, "Is this where you can go to find the fireworks?" Being asked the question so many times, of course, implied that there were many people who at least had some idea that this was the right place.

But since we didn't know where we were going either, it eventually became more clear: we asked a few other people for directions, and they gave the same answer as us: "I hope so!" But people were walking in some direction or another. At one point we actually caught ourselves following a group of people who, five minutes earlier, had asked us whether this was the right place. Some consideration led us to this important life observation:

If nobody around you knows where they're going, the people who know where they're going are somewhere else.

Upon realizing this, we departed quickly in search of people who knew where they were going. Happily, we found the site in our second-guess location of where the fireworks ought to be, right before the show actually started, so it all worked out for the best, and with a free philosophy lesson to boot.

(For the morbidly curious: you should know that the Quai Jacques-Cartier is apparently not the same as the Pont Jacques-Cartier. Perversely, the fireworks last night were at the Quai, but the fireworks tonight are at the Pont.)

Death by Abstraction

I put in some work this weekend trying to make UniConf maximally palatable to the KDE project. Since UniConf does pretty much everything that can be done, or else can rather easily be extended to do so, you might think this would be an easy job. Not exactly, because of one tricky bit: performance.

UniConf uses a rather neat abstraction, introduced to UniConf by Jeff Brown a couple of years ago, called a "generator." It's one minor step up from a filesystem in a Unix VFS. Because of the slightly more/less rigid (depending how you look at it) rules a generator has to follow, UniConf makes it easier to stack layers of generators together to achieve what you want, and so people do. For example, there's an ini-file generator, a TCP client-server generator, a D-BUS client-server generator, at least two kinds of generic caching generators, a list-of-generators generator, and so on. As of yesterday, there's also an automount generator (take that, autofs!) and a filesystem one-file-per-key generator, the two of which work together (with the ini-file generator) in a slightly complicated way to handle reading/writing your ~/.kde/share/config directory.

The power of the UniConf abstraction is that you can easily rearrange these individual parts to do something similar but different: for example, if you wrote an xml-file generator, you could combine it with the same one-file-per-key and automounter generators to produce something that can read/write your ~/.gconf directory instead. That's a lot less work than writing a tree-of-xml-files configuration system from scratch, and you benefit from other people having debugged the other generators you're using.
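The stacking idea can be sketched in a few lines of Python. This is not UniConf's actual API (UniConf is C++, and its real generator interface is richer than this); the class names `Generator`, `MemoryGenerator`, `CacheGenerator` and `ListGenerator` and the key names are all hypothetical, chosen only to show how layers that share one get/set interface can be rearranged freely.

```python
class Generator:
    """Hypothetical minimal interface: every layer exposes the same get/set."""
    def get(self, key):
        raise NotImplementedError

    def set(self, key, value):
        raise NotImplementedError


class MemoryGenerator(Generator):
    """Leaf generator standing in for a real backend such as an ini file."""
    def __init__(self, initial=None):
        self.data = dict(initial or {})
        self.reads = 0  # counts how often the backend is actually consulted

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class CacheGenerator(Generator):
    """Wraps any generator and remembers values it has already fetched."""
    def __init__(self, inner):
        self.inner = inner
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.inner.get(key)
        return self.cache[key]

    def set(self, key, value):
        self.inner.set(key, value)
        self.cache[key] = value


class ListGenerator(Generator):
    """Reads from the first layer that has the key; writes go to the first layer."""
    def __init__(self, layers):
        self.layers = layers

    def get(self, key):
        for layer in self.layers:
            value = layer.get(key)
            if value is not None:
                return value
        return None

    def set(self, key, value):
        self.layers[0].set(key, value)


# Per-user settings override system defaults, with a cache on top of both.
user = MemoryGenerator({"font/size": "12"})
system = MemoryGenerator({"font/size": "10", "font/name": "Sans"})
config = CacheGenerator(ListGenerator([user, system]))

print(config.get("font/size"))   # found in the user layer
print(config.get("font/name"))   # falls through to the system layer
config.get("font/name")          # served from the cache this time
print(system.reads)              # how often the system backend was consulted
```

Swapping `MemoryGenerator` for some other leaf, or reordering the layers, changes the behavior without touching the code that calls `config.get()`.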

But in a system this general, performance can be a problem. In my initial simple-minded experiment, switching to UniConf took the start time of "kwrite" from 1.0s to 25s. Okay, so that was a bit extreme, and with minor tweaking I got it back down to 3.0s. But still: that's 2/3 of kwrite's startup time just fiddling with config values. Obviously not okay.

Now, what's amazing about Jeff's abstraction is that it's so complete - the slowness is really not the fault of the abstraction layer at all, because the way it works, the front end matches up almost exactly with the backend; that is, the API you call is pretty much the same as the API the generators each implement. So you could write a *very* fast generator if you wanted - optimally fast, because you can throw away all the layering and just implement the API calls directly. It's just that, because it's so easy to combine pre-existing modules instead of writing something from scratch, most UniConf setups end up being pretty badly non-optimal.

Here's the real killer example: profiling my UniConf-enabled kwrite shows that much of the time is actually spent converting UniConf keys to and from KDE KEntryMap objects. (KConfig keeps a QMap of all the groups/entries in the tree.) To make it "optimally" fast, I should really store all my config entries directly in a QMap, rather than storing them in a UniConf-style data structure and then copying them to a QMap. But there's clearly no way to do that outside of rewriting the pre-existing UniConf generators.
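A rough Python sketch of that trade-off follows. A plain dict stands in for the application's QMap, and both store classes (`TreeStore`, `DirectMapStore`) are hypothetical illustrations, not UniConf or KDE code. Both stores accept the same `set()` calls. The first keeps entries in its own format and has to convert every one of them on export; the second writes straight into the map the application already owns, so there is nothing left to convert.

```python
class TreeStore:
    """Stand-in for a layered store with its own key format (tuples of path segments)."""
    def __init__(self):
        self.tree = {}

    def set(self, key, value):
        self.tree[tuple(key.split("/"))] = value

    def export_to_map(self, target):
        """Copy every entry into the application's flat map; return how many were converted."""
        for path, value in self.tree.items():
            target["/".join(path)] = value
        return len(self.tree)


class DirectMapStore:
    """Same set() call, but backed by the application's own map."""
    def __init__(self, app_map):
        self.app_map = app_map

    def set(self, key, value):
        self.app_map[key] = value

    def export_to_map(self, target):
        """Nothing to convert: the entries already live in the application's map."""
        return 0


# Made-up config entries: 10 groups of 20 keys each.
entries = {f"group{g}/key{k}": str(k) for g in range(10) for k in range(20)}

# Layered route: fill the store, then convert everything into the app's map.
app_map_a = {}
layered = TreeStore()
for key, value in entries.items():
    layered.set(key, value)
converted = layered.export_to_map(app_map_a)

# Direct route: the store is just a thin view over the app's map.
app_map_b = {}
direct = DirectMapStore(app_map_b)
for key, value in entries.items():
    direct.set(key, value)
converted_direct = direct.export_to_map(app_map_b)

print("entries converted (layered):", converted)
print("entries converted (direct): ", converted_direct)
print("same final contents:", app_map_a == app_map_b)
```

The catch is the one already described: the direct version is tied to one application's data structure, so it stops being a reusable building block.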

Since writing a completely special "KDE generator" wouldn't be "unified" enough for me, I'm going to instead try to optimize the other generators more until the performance is acceptable. The profiler shows some pretty obvious starting points. Flexible and fast - not the easy approach, but maybe the most fun. Wish me luck.

Confident statement #1:

"I know my place in the world."

Even more confident, but contradictory, statement #2:

"I make my place in the world."

Conundrum

How will you get from #1 to #2?

8 May 2005 (updated 8 May 2005 at 00:57 UTC)
Canada and Copyright

There's been quite some discussion lately about Canada's proposed amendments to copyright law, and I'm afraid most people are completely misunderstanding the situation. What you have to realize is how incredibly clever these amendments are. (You also have to realize that a copyright system intended for printing presses, vinyl records, and live CBC radio broadcasts does need an update for the day of the Internet, and in itself, that's a very good thing.)

To dispel some scary misconceptions:

  • The "notice and notice" system they'll impose on ISPs is incredibly lax compared to the draconian "notice and takedown" U.S. system. When an ISP is informed that their customer may have posted infringing material, they have to... inform the customer about it by forwarding them the complaint! Uh, I'm slightly stunned that we even needed a law for this. Oh, they also have to keep a copy of it around to avoid tampering with potential evidence in case a lawsuit follows. Okay, so ISPs should make backups of their systems occasionally. What's the problem?

  • "The alteration or removal of rights management information (RMI) embedded in copyright material, when done to further or conceal infringement, would constitute an infringement of copyright." Don't you realize what a great miracle of phrasing this is?? It's only illegal to remove the RMI stuff if it's done to conceal or further infringement. In other words, hiding your tracks after doing illegal stuff - still illegal! What amazing insight! But notice that what it doesn't do is change the definition of infringement. Removing RMI to make your own damn music play on your music player after you paid good money for it is not infringement. (At least, not based solely on this amendment, and I haven't heard of any other amendments that affect the situation.)

If you feel like writing to your MP about Canada's copyright improvements, please do - but congratulate them for doing the right thing, and encourage them not to later cave to U.S. pressure. The last thing you should do is accuse these people, who apparently actually do have your best interests at heart, of being exactly the kind of people they're trying hard not to be.

Disclaimer: IANAL, but chances are that neither are you. At least I read the bloody web site before posting.

Like Nature, Only Better

Today was an ideal spring day, and it reminded me why I like Montreal the way I do.

I had nothing better to do, so I went for a walk in Parc Mont Royal. The Sunday "Tam-Tams" session was in progress, which as usual made me think of self-organizing systems, but that's only tangentially relevant to my story. I found a little stream of water coming down the mountain - it's been raining a lot in the last few days - and decided, again, because there was nothing better to do, to find out where water comes from.

"Up" is the obvious answer. I traced the stream to a culvert and out (well, in) the other side. Then another, and finally to a hole where it seemed to disappear underground - well, appear from underground. Not to be stopped that easily, I followed the slope a surprisingly long way uphill, and found at least two places where streams about half the size of the original one went underground. There being only one of me, I decided to follow only one of them.

The slope was getting steeper as I got to the more "serious" parts of the mountain, and I noticed my stream - it was "my" stream by now - was shrinking. Looking back, I could see that a few little irrelevant trickles, feeding from every direction, joined my stream occasionally; as I passed each one, my stream, of course, reduced just a little bit each time.

Time passed, and I followed my stream to what eventually turned out to be its source - another tiny, irrelevant trickle, just like all the ones I had passed, coming from a small puddle in some wet grass. There was no big impressive lake draining, no mountain spring shooting water from a crack between two rocks... just another trickle, just like all the others.

I try not to post anything here unless it has a point. Perhaps you think my point will be something trite, like, "There's no ultimate source of power, and nobody is very strong all by himself; real power comes from a lot of little, irrelevant ones acting together." Yeesh. I hope you give me more credit than that.

I looked back down the hill toward where I had started. One thing I hadn't quite noticed on the way up struck me: that the walking path through the park seemed to stray back and forth rather near my little stream, without ever getting wet. In fact, the streambed wasn't entirely random; every so often, a culvert, or ditch, or cobblestone bed, or bridge, appeared, designed to keep the stream flowing smoothly out of the way of the path without allowing any flooding. Humans had been there before me, not just tracing the path of the stream, but adjusting it, controlling it, making it just slightly different so that it would work the way they wanted.

People are not the little, irrelevant trickles that combine together to become a mighty torrent; don't think of it that way. The irrelevant trickles and mighty torrents are just the forces of nature. People are the things that - with the right, tiny effort, applied in just the right places - make those torrents go just the way we want.

And this is why I like Montreal: other places I've been, like Toronto or Ottawa or New York or even Thunder Bay, give you the feeling of a great battle with nature that we won; to build New York, you can imagine that we gathered some trickles, created a big torrent, and blasted it at everything in sight until we built a monument to our power. There's nothing wrong with that - New York is very inspirational that way - but that's not how I feel in Montreal. What I feel here is that we handled things more like we handled that stream; we let nature run its course, and adjusted just a few things along the way. No excessive force; just a little tweak here, a little tweak there. The mentality here is different that way. I like it.

21 Apr 2005 (updated 21 Apr 2005 at 17:41 UTC)
dcoombs wins

"I bet I'm going to be delicious!" the sheep said, a tad pre-sumptuously.

Deep Thoughts

Must come up with pun involving the word "pre-sumptuous." It's a tricky one.
