Older blog entries for louie (starting at number 107)

  • Got stymied over the weekend by !@@!# perl, but did figure out the basics for a very fast stack trace duplicate finder. It isn't perfect, but the mismatches should be isolated enough not to matter.
  • Lessig's new book rocks so far. Factoid: I knew Disney lifted all the Brothers Grimm tales, but I did not know that even the first Mickey Mouse talkie (Steamboat Willie) was a parody of Buster Keaton's Steamboat Bill, Jr., which had come out earlier the same year. Live and learn.
  • "We're losing the 'war' on terrorism b/c it's a horrible way to define the scope of a conflict. Common nouns like 'terrorism' (or drugs or crime or poverty, for that matter) can't surrender and promise never to attack again the way that proper nouns like Germany or Japan can." --Paul Sholtz. Brilliant summation of a whole lot of failures.

The first step in my plan to take over the world is complete. I whipped up a little script that (after much pain) dumped the first five function names (more or less) from all crashes in bugzilla (more or less) into a new table, where I can glean experience and data from them.
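For the curious, the extraction step looks roughly like the sketch below. This is a Python illustration, not the actual script; it assumes gdb-style frames (`#0  0x... in func (args) at file.c:NN`), and the names `FRAME_RE` and `top_functions` are made up for the example. Frames without a usable symbol (`??`, signal handler markers) are skipped, which is part of the "more or less".

```python
import re

# Matches gdb-style frames such as
#   "#0  0x40a3b1c1 in g_free (mem=0x0) at gmem.c:186"
#   "#3  nautilus_file_ref (file=0x0) at nautilus-file.c:10"
# The address part is optional; unresolved frames ("?? ()") do not match.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_]\w*)\s*\(")


def top_functions(trace_text, limit=5):
    """Return up to `limit` function names from the top of a stack trace."""
    names = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # not a frame line, or a frame with no symbol name
        names.append(match.group(1))
        if len(names) == limit:
            break
    return names
```

Taking only the top few frames keeps the signature short and comparable across reports, at the cost of lumping together crashes that diverge deeper in the stack.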

The long-term idea is that you'd be able to submit a bug and get an answer right away. Today we either do time-consuming manual stack matching after submission or run a very slow match on the bodies of the 400K comments. Instead, at submit time a very quick query on this new table could say 'actually, we think this is a duplicate of bug XXXXX, look familiar?' Less spam for everyone, less work for bugsquad, more accurate data on what things are reported most often.
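A minimal sketch of that submit-time lookup, assuming the function names are joined into one signature string per bug and stored in an indexed column. SQLite stands in for the real bugzilla database here, and the table name `crash_signatures` and the helper names are hypothetical, not the real schema.

```python
import sqlite3


def make_signature_table(conn):
    # Hypothetical side table: one row per crash bug, keyed by its signature.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crash_signatures ("
        "bug_id INTEGER PRIMARY KEY, signature TEXT NOT NULL)"
    )
    # The index is what makes the submit-time lookup cheap.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_signature "
        "ON crash_signatures (signature)"
    )


def record_crash(conn, bug_id, functions):
    conn.execute(
        "INSERT INTO crash_signatures (bug_id, signature) VALUES (?, ?)",
        (bug_id, "|".join(functions)),
    )


def find_likely_duplicate(conn, functions):
    """Return the oldest bug with the same signature, or None."""
    row = conn.execute(
        "SELECT MIN(bug_id) FROM crash_signatures WHERE signature = ?",
        ("|".join(functions),),
    ).fetchone()
    return row[0]
```

Exact string equality is the crudest possible match; it is only meant to show why the lookup can be one indexed query rather than a scan over comment bodies.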

Short-term there are three things I should do. First is to resurrect and clean up the old simple-dup-finder, which ran on the same principle as this experiment, so in the worst case the new version is no less accurate than the old one and a hell of a lot faster. Second is to figure out exactly how accurate it is. I think I should be able to do some queries that will be pretty revealing of how this auto-matching compares with human matching; I just have to figure out exactly how to do the comparison to get meaningful data. Third is to think about how to make this permanent. Since it is all in a separate table, I think the upgrade risk is low: even if the table gets nuked in an upgrade, we could just re-run the script that created the table in the first place. More complex are any UI implications and how they are handled. At least at first these scripts will all live in a separate directory and not interact with the rest of the UI, so for the time being it isn't a big issue. We'll see, of course.
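One possible shape for the accuracy check, and only a guess at this point: treat the duplicate pairs that humans have already marked as ground truth and see how many of the auto-matched pairs agree. The function name and the pair-of-bug-ids input format are assumptions for the sake of the sketch.

```python
def compare_with_human(auto_pairs, human_pairs):
    """Compare auto-matched duplicate pairs against human-marked ones.

    Each pair is (bug_id, bug_id); order within a pair does not matter.
    Returns (precision, recall) relative to the human markings.
    """
    auto = {frozenset(pair) for pair in auto_pairs}
    human = {frozenset(pair) for pair in human_pairs}
    agreed = auto & human
    # Precision: how many auto matches did a human also call a duplicate?
    precision = len(agreed) / len(auto) if auto else 0.0
    # Recall: how many human-marked duplicates did the auto matcher find?
    recall = len(agreed) / len(human) if human else 0.0
    return precision, recall
```

The obvious caveat is that human markings are incomplete, so an auto match that no human confirmed is not necessarily wrong.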

Let me be the first to offer Rob Love congratulations. And hopefully the last to offer him a job.

It's pretty sweet when you post a page with a 'patches accepted' text, and within a few days someone sends you a patch. :) Thanks, Tony.

26 Apr 2004 (updated 26 Apr 2004 at 07:13 UTC) »
  • Got to learn a little about left and inner joins today. The result is a patch query page that is dog slow but works. [Later] Sr. Willcox and Rob Adams goaded me into adding some indexes tonight instead of sleeping, and it totally paid off: queries went from ~30 seconds to ~3 seconds.
  • Triplets of Belleville ruled, as did some of the Sonoma County wines from our wine tasting.
  • Finally also saw Rashomon- more rocking there.
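On the join lesson from the first item above, here is a toy illustration in Python with SQLite. The `bugs` and `attachments` tables and their columns are simplified stand-ins, not the real bugzilla schema. An inner join only returns bugs that have a patch; a left join keeps every bug and fills in NULL where there is none. The index on the join column is the kind of thing that took the real queries from ~30 seconds to ~3, though a three-row table obviously won't show that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bugs (bug_id INTEGER PRIMARY KEY, summary TEXT);
CREATE TABLE attachments (
    attach_id INTEGER PRIMARY KEY, bug_id INTEGER, is_patch INTEGER);
INSERT INTO bugs VALUES
    (1, 'crash on startup'), (2, 'typo in menu'), (3, 'panel leak');
INSERT INTO attachments VALUES (10, 1, 1), (11, 1, 0), (12, 3, 1);
-- Index the join column so lookups by bug_id need not scan the table.
CREATE INDEX idx_attachments_bug ON attachments (bug_id);
""")

# INNER JOIN: only bugs that actually have a patch attached.
inner_rows = conn.execute("""
    SELECT b.bug_id, a.attach_id
    FROM bugs b
    INNER JOIN attachments a ON a.bug_id = b.bug_id AND a.is_patch = 1
    ORDER BY b.bug_id
""").fetchall()

# LEFT JOIN: every bug, with NULL (None) where no patch exists.
left_rows = conn.execute("""
    SELECT b.bug_id, a.attach_id
    FROM bugs b
    LEFT JOIN attachments a ON a.bug_id = b.bug_id AND a.is_patch = 1
    ORDER BY b.bug_id
""").fetchall()
```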

Random bits:

  • The Green Party of Canada is going to create their party platform for the next election via a wiki. Holy Shit. I hope some deliberative democracy folks are being given access to the diffs as they happen.
  • mofo.com is not what you'd think it would be.
  • Pipka: I apologize.
  • My data is not as hosed as I thought it could be, but I did lose all the talks I've ever given and a fair chunk of work stuff. I went from four copies of that directory to none in around 24 hours. Spectacular.
  • Going to a wine tasting tonight at The Kendall. Should be fun.
  • In my continuing quest to take mini-vacations where I listen to smart people, I'm spending most of the weekend doing pre-reading for ilaw.

It is not a good day when you're working until 4am, and your wakeup call is a reminder you've forgotten a 10am meeting. It is a worse day when your backup restore tool starts segfaulting inexplicably. It is a much, much worse day when you rm -rf on the wrong machine, deleting something like 20% of the backup before you wonder 'why is this taking so long?' I think I'll have some bruising on my hand from hitting the wall. Hopefully it was only the incremental backups and metadata that were lost and not the main backup, but it is a bit hard to figure out exactly ATM. :/ If anyone needs me, I'll be the one over in the corner screaming.

More time spent talking to smart people today at Sloan School, including a couple RH folk. I talk to really, really smart people all the time, of course, but a new group of smart people stretches your brain in totally different ways, which is awesome. I think maybe if I had my way I'd sort of just stew in a mix of regularly rotated ridiculously smart people all the time, mostly just sitting in a corner while they talked and occasionally trying to ask an intelligent question. Would be a lot of fun. Sort of like GUADEC, year round, with topic changes. :)

Met some really interesting people today at MIT. Hopefully they'll find me interesting when I talk to them in eight hours.

Brutal but fairly productive 10 hour meeting today. Oy.

Spent a tiny portion of the weekend playing with xchat-gnome. Something lots of people have talked about, someone finally went ahead and did. Very cool.
