Older blog entries for Zaitcev (starting at number 415)

BadName is essentially conquered

The issue with random applications failing to start (Firefox, Nautilus) or blowing up (panel, gvim) with BadName took me about 3 months to find (the bug was filed at the end of January). I'm not sure if my fix is any good, need to poke Ajax about it.

So... Wasted a lot of time, learned several mildly interesting things about the code and people involved.

The sad part is how much it takes to start moving around any modern codebase, and that's with the same language and toolchain. I remember times when no part of the system was off-limits, but these days... not so much. If anything breaks in OpenOffice, I'm not even going to try fixing it.

Syndicated 2008-05-02 10:23:00 (Updated 2008-05-02 10:24:01) from Pete Zaitcev

23 Apr 2008 (updated 23 Apr 2008 at 21:15 UTC) »

Ted Tytso on [Open]Solaris

Ted suddenly decided to talk OpenSolaris. Pretty interesting... at least for me, since I spent the 7 best years of my life in Sun's orbit.

In passing, aside from the bulk of the post, it seems to me that the final argument, about competitors selling Solaris support, does not hold water. This is exactly what Oracle attempted with their clone of CentOS and they weren't very successful, despite having a strong Linux team under Wim.

Other than that, he's probably right. But he's going to get responses. Whenever I mention Solaris (last time it was when I linked to Jeff Bonwick's blog), I get the most inane responses from Solaris fanboys. It looks like a very vocal community of users, if not contributors. Sounds like Apple almost.

This puts a damper on any dreams I may have about re-living the glory of my youth by getting back to hacking on that codebase.

UPDATE: Not sure why Levon decided to post his reply to his personal blog instead of the one at Sun. Surely the other one is more relevant?

Syndicated 2008-04-23 18:12:46 (Updated 2008-04-23 20:38:59) from Pete Zaitcev

Random dmesg errors

I was always against the kernel spewing user-generated errors into dmesg, like this:

npviewer.bin[4393]: segfault at f6712030 ip 67e7a0 sp ff9c39ec error 4 in libpthread-2.8.so[677000+15000]

Not helpful, not interesting.

However, the other day my desktop keeled over in a strange way... The /var/log/messages contained this (followed by a stack trace):

Apr 13 18:19:14 niphredil kernel: Xorg: page allocation failure. order:3, mode:0x4020

It looks like a bug in SLUB (it does not seem to be registering with anyone who has the power to track it down, though). But my point is, without the printout I would need to find what was happening by other means, and that would probably take forever.

Hmm... My world is shaken.

P.S. kgdb was merged into 2.6.26. The sky is falling.

Syndicated 2008-04-19 19:29:06 from Pete Zaitcev

Jon Corbet on Red Hat and Desktop

Seen at LWN today (no permalink — what the heck?):

Red Hat's desktop team has posted an item saying that the company has no plans to offer a "traditional desktop product" anytime soon.

Say what? The referenced item says:

[W]e have no plans to create a traditional desktop product for the consumer market in the foreseeable future.

Umm... RHEL desktop is doing quite well; all we're saying is that we're not committed to selling it at Best Buy. Not sure how this debacle has happened. Jon was probably short on coffee.

Syndicated 2008-04-17 17:28:04 from Pete Zaitcev

ipv6.google.com

If the client is logged in, Google bounces to the old site. In order to access it over IPv6, you have to log out and use http://ipv6.google.com/webhp. Apparently, not quite there yet.

One funny thing: using Google while logged out is much faster. Apparently it takes time for them to act upon the cookies my client sends. Remind me again how they goaded everyone into this "homepage" thing. Ah yes, Gmail.

Syndicated 2008-04-16 16:30:45 from Pete Zaitcev

15 Apr 2008 (updated 16 Apr 2008 at 02:09 UTC) »

Fallback-induced thoughts

I saw two or three bug filings in the last couple of months which deal with a USB device not working until ehci_hcd is unloaded. Thinking sensibly, it's rather normal: a poorly-made or poorly-cabled device may choose to report High (480) speed yet be unable to communicate at that speed. And a couple of devices failing across half a million users is rare. However, the thing is, such cases were extremely rare before; I don't even remember the last time this happened. So, I'm starting to worry that EHCI hardware or software may have a subtle bug somewhere (perhaps specific silicon percolated to the field).

If only there was a way to tap into Novell's bugzilla and watch their kernel bugs, to collate with ours. Ditto Bligh's Bugme and Ubuntu's whatever (Launchpad?).

For readily identifiable bugs, we just report them to linux-usb or whatever and then patterns just come together, but the problem of fallback-wannabe devices is too flimsy and vague.

P.S. By "fallback" I mean the new code which switches a port over to Full (12) speed if enumeration fails. It's a practical solution, but it seems like sweeping the problem under the carpet to me. Also, it won't work for anything that's plugged into a hub.

UPDATE: Amit from Ubuntu pointed to their bug 88746. V.interesting.

Syndicated 2008-04-15 21:29:40 (Updated 2008-04-16 01:47:52) from Pete Zaitcev

Unpleasant mass updates

Mass updates to Bugzilla have a few unpleasant side effects:

  • Unless they're done by DKL with direct access to the database, they generate a lot of e-mail which buries actual updates.
  • They destroy the usability of queries for "bugs modified in the last 60/45/30 days". It's a useful trick I learned from Arjan. But now all kernel bugs are recently modified.

The idea, I guess, is that the developer has to rescan relevant bugs and either work on them, push them into NEEDINFO, or close them. If hackers are diligent about it, the auto-closer is harmless [and also, unnecessary -- ed]. In reality though, it just does not work that way [and the very existence of the auto-closer is the proof -- ed]. At a certain point, I started making extra-Bugzilla lists of bugs which look realistic to work on (e.g. have an active submitter who cooperates, for one thing). The rest just rots. I don't even have cycles to push WONTFIX on them (or, actually, I have time to close, but I don't want to deal with the fallout, so I just pretend not to see them -- the task made easier by the mass-update and the resulting mail avalanche).

P.S. My list of bugs is, like, 10 to 50 times smaller than Chuck's and DaveJ's. I don't understand how they cope. It seems impossible to me, so there must be some trade secret good kernel monkeys know.

Syndicated 2008-04-14 22:45:03 (Updated 2008-04-14 22:46:34) from Pete Zaitcev

The Belgian paper

Completely useless. Gee, the thugs running the worst shitholes of the world can forge documents signed by children and make all their Web access trackable and non-refutable. Dog bites man. We knew before this paper that they condition children to carry their own telescreens. The only thing I want to know about BitFrost is how to defeat it, and the paper doesn't say. Useless.

Syndicated 2008-04-09 15:51:11 from Pete Zaitcev

Timeouts

I hadn't tried to burn a CD with ub in a while, because my new laptop comes with a built-in burner. After all the hustling with __blk_end_request, I thought the situation called for a test. This looked worrisome:

Track 01: Total bytes read/written: 548321280/548321280 (267735 sectors).
Errno: 5 (Input/output error), close track/session scsi sendcmd: cmd timeout after 5.000 (480) s
CDB:  5B 00 02 00 00 00 00 00 00 00
cmd finished after 5.000s timeout 480s
cmd finished after 5.000s timeout 480s
wodim: Cannot fixate disk.

The resulting CD was not a coaster though. A welcome surprise, but clearly I did something wrong regarding timeouts, and it needs fixing (although I'm quite sure that there's no other person on Earth who would want to burn CDs with ub).

BTW, the new cdrecord looks nice indeed. Before, I only used the one maintained by that self-centered dude with attitude... No idea who maintains this one, but it seems to be working OK.

Syndicated 2008-04-07 03:28:25 from Pete Zaitcev

22 Mar 2008 (updated 2 Apr 2008 at 01:08 UTC) »

What Would Rusty Say?

One of the many great things Rusty has done was introducing the Misuse Levels of APIs (in his OLS 03 keynote, slide 30 and beyond). I had a run-in with something of that nature last week.

Here's an interface:

/**
 * blk_end_request - Helper function for drivers to complete the request.
 * @rq:       the request being processed
 * @error:    0 for success, < 0 for error
 * @nr_bytes: number of bytes to complete
 *
 * Description:
 *     Ends I/O on a number of bytes attached to @rq.
 *     If @rq has leftover, sets it up for the next range of segments.
 *
 * Return:
 *     0 - we are done with this request
 *     1 - still buffers pending for this request
 **/
int blk_end_request(struct request *rq, int error, unsigned int nr_bytes)

What do you think the "number of bytes to complete" is? It seemed natural to me that it's the number of bytes which was transferred (and thus, it can be smaller than the number of bytes remembered in the request). This is how I would design an API. But in this case, nr_bytes is the number of bytes which was in the request initially. As such, it is greater than the request->data_len, which drivers modify to indicate the residue.

I think this has something to do with Tomo's & Jens' desire to avoid modifying drivers which poke ->data_len today (indeed, the code doing so in ub remained unchanged). If so, the price is too steep, IMHO.

Curiously, the designers of the API themselves misused it when they converted ub. They called __blk_end_request() with an argument of blk_rq_bytes(rq), but since ub modifies ->data_len, it guaranteed a failure for packet requests.

Everything seems to be working now, but I suspect that 2.6.25 is going to ship with a broken ub (thank Chris Wright for the Stable Tree).

UPDATE: See also a blog article (same server, but helps if Rusty decides to reshuffle his home directory).

Syndicated 2008-03-22 04:49:37 (Updated 2008-04-02 00:22:24) from Pete Zaitcev
