Older blog entries for mjg59 (starting at number 200)

2 Apr 2009 »

The curious tale of the driver that did nothing

Several vendors are now shipping Qualcomm's Gobi chipset, a cunning dual CDMA-GSM wireless broadband device. There's a driver for it in the Linux kernel called qcserial which claims to support it.

Do not be fooled. This driver is a vile lie.

The hardware comes up in a dumb state and requires firmware to be loaded before it'll do anything. The only way to obtain this firmware is from a Windows driver. The only way to load this firmware is under Windows. This isn't helpful, especially given that it drops the firmware whenever you use rfkill or suspend or power down the machine. In fact, the only way you can use this driver is to boot Windows, let it load the firmware, reboot into Linux, get online and then never turn off or suspend your computer or the radio.

So, don't be like me - swearing viciously and trying to generate useful USB packet dumps in an attempt to get the hardware working. Known bad parts are the HP un2400 and the Dell wireless 5600 - Sony also have a Gobi part that's used in the P-series machines, and Acer have one as well. I'll update this if I ever get anywhere with a firmware uploader, but until then remember that the presence of a driver in the kernel doesn't mean that you can actually do anything with the hardware. Fyalcomm.

Syndicated 2009-04-02 18:43:23 from Matthew Garrett

27 Mar 2009 »

Reducing disk use

UNIX filesystems generally store three pieces of timing information about files - ctime (when the file was changed in any way), mtime (when the file contents, as opposed to its metadata, were last changed) and atime (when the file was last accessed by any process). This is a usefully flexible system, but the semantics of atime can be troublesome. atime must be updated every time a file is read, causing a read operation to instead become a read/write operation. This results in a surprising amount of io being generated in normal filesystem use, slowing the more relevant io and causing disks to spin up due to atime updates being required even if the file was read out of cache. It also results in a lot of unnecessary activity on flash media which may reduce their lifetime.

One option is to disable atime updates entirely. The problem with this approach is that certain applications depend on atime. This is especially common in mail clients which compare atime to mtime in order to determine whether a mailbox has been read since it was last modified. So, unfortunately, disabling atime entirely is impractical as a default. Back in 2006, Valerie Aurora submitted a patch that worked around this issue. The new relatime option meant that atime would only be updated if it would otherwise be older than ctime or mtime. Mail clients became happy and the world rejoiced.
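The mail client check described above can be sketched roughly as follows. This is a minimal illustration rather than any particular client's code, and `mailbox_has_unread` is a hypothetical name: the mailbox is treated as unread if it was modified more recently than it was last accessed.

```python
import os

def mailbox_has_unread(path):
    """Return True if the mailbox was modified after it was last read.

    Hypothetical helper: compares mtime to atime, which is why this
    style of check breaks if atime is never updated.
    """
    st = os.stat(path)
    return st.st_mtime > st.st_atime
```

With relatime, a read of a mailbox whose atime is older than its mtime still updates atime, so a check like this keeps working.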

Unfortunately, it turned out that there was one other common case of atime being used. Applications like tmpwatch monitor files in /tmp and delete them if they appear unused. In this case, "unused" means "has an atime older than a certain date". Since merely reading files doesn't update the ctime or mtime, relatime wouldn't cause the atime on these files to be updated and tmpwatch would happily delete them - even if users were reading them on a daily basis.
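A minimal sketch of the kind of test involved, assuming a hypothetical helper called `looks_unused` (this is not tmpwatch's actual code): a file counts as unused once its atime is older than some cutoff.

```python
import os
import time

def looks_unused(path, max_age_seconds, now=None):
    """Return True if the file's atime is more than max_age_seconds old.

    Hypothetical helper for illustration only. If atime is never
    refreshed on read, a file that is read every day still ends up
    looking unused here.
    """
    if now is None:
        now = time.time()
    return now - os.stat(path).st_atime > max_age_seconds
```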

Ingo Molnar submitted a patch to add a further heuristic to the relatime behaviour. With it, the atime of a file will be updated if it's older than mtime, older than ctime or (and this is the important one) more than 24 hours in the past. This deals with the tmpwatch case nicely, while still providing a significant reduction in the quantity of atime updates.
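The decision described above can be written down as a small function. This is a sketch of the rule as stated in the text, not the kernel code; the name `relatime_needs_update` is made up for illustration, and the handling of exact ties (atime equal to mtime or ctime) is an assumption.

```python
DAY_SECONDS = 24 * 60 * 60

def relatime_needs_update(atime, mtime, ctime, now):
    """Return True if a read at time `now` should update atime.

    All arguments are timestamps in seconds. Follows the three
    conditions in the text: atime older than mtime, older than ctime,
    or more than 24 hours in the past.
    """
    if atime < mtime:
        return True
    if atime < ctime:
        return True
    if now - atime > DAY_SECONDS:
        return True
    return False
```

A file that is read repeatedly within a day and not otherwise changed only gets its atime written once, which is where the reduction in io comes from.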

Fedora shipped this patch for several releases, and Ubuntu have used it by default since 8.04. Unfortunately there were some concerns over certain aspects of its behaviour (in respect to its interface as opposed to the relatime functionality itself) and it never got merged. I pushed a trimmed down version that purely implements the change to the relatime behaviour, and earlier today Linus merged it and a further patch that makes relatime the default behaviour on Linux.

Most users won't notice this change in behaviour at all, other than as a small improvement in io performance and a reduction in the number of drive spinups. For users that do have issues, a new strictatime mount option has been added - using this will require an updated mount command, but it's a trivial patch. I'd be surprised if there are any real world use cases that are negatively affected by this, especially since it's been default behaviour in several distributions for a while, but there's always the potential that someone will be tripped up by it. We'll see.

Syndicated 2009-03-27 00:21:51 from Matthew Garrett

25 Mar 2009 »

Today is Ada Lovelace day, a project to celebrate the women involved in technology and make it easier to give examples of female role models in the industry. I've had the opportunity to meet many women working on Linux over the past few years, and if there's one thing that's tended to overshadow the achievements and dedication to their work it's the sheer amount of effort they've had to go to in order to gain equal recognition. And that made me realise that in many ways, the woman who's had the greatest impact on my career is Hanna Wallach[1]. About ten years ago she spent a ridiculous amount of effort teaching me that it didn't matter how much I professed to be entirely free of sexism if I then proceeded to do things that implicitly excluded women from being involved in computing communities. I've gone to some amount of effort to repay that (contains profanity and a man with a raccoon tail covering his crotch), but I'm very aware of how different my life might have been if Hanna hadn't gone to the trouble of ensuring that I knew not to be a dick.

I think the Linux community has become more welcoming since then, and Hanna and people like her have been instrumental in helping that happen. However, we still have people being driven away by the behaviour of others or arguments that technical contributions are more important than social behaviour, and while that's true we need role models for social change just as much as we need role models for technical achievement. Thanks, Hanna[2]. The Linux world's a better place because of you.

[1] And while she's a role model of mine for social reasons, she's also got a PhD in machine learning, was British computer science student of the year in 2001 and came top of her MSc class at Edinburgh. So there.

[2] "Thanna"

Syndicated 2009-03-24 23:59:00 from Matthew Garrett

18 Mar 2009 »

ext4 and spinups

As a followup to my discussion of ext4, I did some trivial testing today. This consisted of generating a file, writing to it, closing it, opening another file, writing to it, closing it and then renaming the second file over the first. Then repeating about 10,000 times. fsync() wasn't called at any point. I used the /proc/sys/vm/block_dump knob to get an idea of when stuff was actually hitting disk. This was using the current rawhide kernel, which has the various workarounds for avoiding data corruption merged. The good news is that I was unable to get into an inconsistent state - the files always contained either the original or the new data, and the absolute worst case was having a zero length temporary file. The bad news is that while ext3 only touched disk when it reached the commit interval, ext4 touched disk for every single rename() call.
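The write pattern used in the test is roughly the following. This is a reconstruction from the description above rather than the program that was actually run; the function name and file names are hypothetical, and there is deliberately no fsync() anywhere.

```python
import os

def rename_over_loop(directory, iterations=10000):
    """Repeatedly write a file, write a second file, and rename the
    second over the first, without ever calling fsync().

    Returns the path of the target file.
    """
    target = os.path.join(directory, "target")
    temp = os.path.join(directory, "target.tmp")
    for i in range(iterations):
        # Generate the first file, write to it and close it.
        with open(target, "w") as f:
            f.write("original %d\n" % i)
        # Open another file, write to it and close it.
        with open(temp, "w") as f:
            f.write("new %d\n" % i)
        # Rename the second file over the first; os.replace is
        # rename() on POSIX systems.
        os.replace(temp, target)
    return target
```

Running this while watching the output enabled by /proc/sys/vm/block_dump is one way to see when each filesystem actually touches the disk.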

There appears to be a plan for dealing with this, but the current state of affairs is that ext4 will (under an obviously incredibly synthetic and unrealistic set of conditions) reduce the amount of time a drive can spend spun down. The next step is to attempt to measure this in a real world setup and see whether it's compensated for by the other new features.

Syndicated 2009-03-18 18:59:53 from Matthew Garrett

18 Mar 2009 »

Important typographical update

⸘Unicode 5.1 added the inverted interrobang‽

Syndicated 2009-03-18 02:27:06 from Matthew Garrett

14 Mar 2009 »

ext4, application expectations and power management

There's been a certain amount of discussion about behavioural differences between ext3 and ext4[1], most notably due to ext4's increased window of opportunity for files to end up empty due to both a longer commit window and delayed allocation of blocks in order to obtain a more pleasing on-disk layout. The applications that failed hardest were doing open("foo", O_TRUNC), write(), close() and then being surprised when they got zero length files back after a crash. That's fine. That was always stupid. Asking the filesystem to truncate a file and then writing to it is an invitation to failure - there's clearly no way for it to intuit the correct answer here. In the end this has been avoided by avoiding delayed allocation when writing to a file that's just been truncated, so everything's fine.

However, there's another case that also breaks. A common way of saving files is to open("foo.tmp"), write(), close() and then rename("foo.tmp", "foo"). The mindset here is that a crash will either result in foo.tmp being zero length, foo still being the original file or foo being your new data. The important aspect of this is that the desired behaviour of this code is that foo will contain either the original data or the new data. You may suffer data loss, but you won't suffer complete data loss - the application state will be consistent.

When used with its (default) data=ordered journal option, ext3 provided these semantics. ext4 doesn't. Instead, if you want to ensure that your data doesn't get trampled, it's necessary to fsync() before closing in order to make sure it hits disk. Otherwise the rename can occur before the data is written, and you're back to a zero length file. ext4 doesn't make guarantees about whether data will be flushed before metadata is written.
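For concreteness, here is a minimal sketch of the save-via-rename pattern with the fsync() made optional. The function name `save_file`, the `.tmp` suffix and the `durable` flag are all assumptions made for illustration; with `durable=False` this is the sequence that ext3's data=ordered mode happened to make safe, and with `durable=True` it is the sequence ext4 requires.

```python
import os

def save_file(path, data, durable=True):
    """Write `data` (bytes) to a temporary file, then rename it over `path`.

    With durable=True the data is fsync()ed before the rename, so the
    rename can never become visible ahead of the file contents.
    """
    tmp_path = path + ".tmp"
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # os.write may write fewer bytes than asked, so loop until done.
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        if durable:
            # Force the data out to the disk before the rename happens.
            os.fsync(fd)
    finally:
        os.close(fd)
    # Replace the original with the new file in a single rename().
    os.replace(tmp_path, path)
```

The two variants leave the same file contents behind on a normal run. They differ only in what can survive a crash and in whether the disk has to be woken up for every save.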

Now, POSIX says this is fine, so any application that expected this behaviour is already broken by definition. But this is rules lawyering. POSIX says that many things that are not useful are fine, but doesn't exist for the pleasure of sadistic OS implementors. POSIX exists to allow application writers to write useful applications. If you interpret POSIX in such a way that gains you some benefit but shafts a large number of application writers then people are going to be reluctant to use your code. You're no longer a general purpose filesystem - you're a filesystem that's only suitable for people who write code with the expectation that their OS developers are actively trying to fuck them over. I'm sure Oracle deals with this case fine, but I also suspect that most people who work on writing Oracle on a daily basis have very, very unfulfilling lives.

But anyway. We can go and fix every single piece of software that saves files to make sure that it fsync()s, and we can avoid this problem. We can probably even do it fairly quickly, thanks to us having the source code to all of it. A lot of this code lives in libraries and can be fixed up without needing to touch every application. It's not the end of the world.

So why do I still think it's a bad idea?

It's simple. open(), write(), close(), rename() and open(), write(), fsync(), close(), rename() are not semantically equivalent. One is "give me either the original data or the new data"[2]. The other is "always give me the new data". This is an important distinction. fsync() means that we've sent the data to the disk[3]. And, in general, that means that we've had to spin the disk up.

So, on the one hand, we're trying to use things like relatime to batch data to reduce the amount of time a disk has to be spun up. And on the other hand, we're moving to filesystems that require us to generate more io in order to guarantee that our data hits disk, which is a guarantee we often don't want anyway! Users will be fine with losing their most recent changes to preferences if a machine crashes. They will not be fine with losing the entirety of their preferences. Arguing that applications need to use fsync() and are otherwise broken is ignoring the important difference between these use cases. It's no longer going to be possible to spin down a disk when any software is running at all, since otherwise it's probably going to write something and then have to fsync it out of sheer paranoia that something bad will happen. And then probably fsync the directory as well, because what if someone writes an even more pathological filesystem. And the disks sit there spinning gently and chitter away as they write tiny files[4] and never spin down and the polar bears all drown in the bitter tears of application developers who are forced to drink so much to forget that they all die of acute liver failure by the age of 35 and where are we then oh yes we're fucked.

So. I said we could fix up applications fairly easily. But to do that, we need an interface that lets us do the right thing. The behaviour application writers want is one which ext4 doesn't appear to provide. Can that be fixed, please?

[1] xfs behaves like ext4 in this respect, so the obvious argument is that all our applications have been broken for years and so why are you complaining now. To which the obvious response is "Approximately anyone who ever used xfs expected their data to vanish if their machine crashed so nobody used it by default and seriously who gives a shit". xfs is a wonderful filesystem for all sorts of things, but it's lousy for desktop use for precisely this reason.

[2] Yes, ok, we've just established that it actually isn't that in the same way that GMT isn't UTC and battery refers to a collection of individual cells and so you don't usually put multiple batteries in your bike lights, but the point is that this is, for all practical intents and purposes, an unimportant distinction and not one people should have to care about in their daily lives.

[3] The disk is free to sit there bored for arbitrary periods of time before it does anything, but that's fine, because the OS is behaving correctly. Sigh.

[4] Dear filesystem writers - application developers like writing lots of tiny files, because it makes a large number of things significantly easier. This is fine because sheer filesystem performance is not high on the list of priorities of a typical application developer. The answer is not "Oh, you should all use sqlite". If the only effective way to use your filesystem is to use a database instead, then that indicates that you have not written a filesystem that is useful to typical application developers who enjoy storing things in files rather than binary blobs that end up with an entirely different set of pathological behaviours. If I wanted all my data to be in Oracle then I wouldn't need a fucking filesystem in the first place, would I?

Syndicated 2009-03-14 21:04:01 from Matthew Garrett

6 Mar 2009 »

After a bit of back and forth with Peter, we came up with a straightforward way of dealing with the fact that the Wacom driver needs a logical input device per input type[1], but the X server only generates an input device per hal device. The simplest solution turned out to be a hal callout that generates additional hal devices on demand, which also means we can add information to the fdi files to only add the appropriate device types. Ought to land in rawhide in the near future, at which point tablets should be basically working out of the box. Except that xsetwacom gets device name -> type mapping by attempting to parse xorg.conf. Pass the suicide.

Today's other accomplishment was spending long enough looking at Toshiba ACPI dumps to figure out how to enable hotkey reporting without needing to poll. Of course, I then found that the FreeBSD driver has done the same thing since 2004. Never mind. Patch has been posted to lkml and I've shoved it into rawhide, so that'll improve things for most Toshiba users. There's a few machines that have an entirely different BIOS and we don't know how the hotkeys there work at all, so life continues to be miserable for those of you that own them. Sorry.

[1] Stylus, cursor, eraser and so on

Syndicated 2009-03-06 04:46:26 from Matthew Garrett

23 Feb 2009 »

Misc

Minor Fedora updates - I fixed up the FDI file in the wacom package, so tablet PCs should have a working stylus out of the box in rawhide. The eraser won't work right now - the driver needs some reworking to bind multiple X devices to a single logical input device. I've also added support for brightness control via smartdimmer to nouveau, which should increase the number of machines that have working brightness control. I don't think this has landed in the rawhide kernel yet, but should do soon. There's the potential for some conflict with the mbp_nvidia_bl driver. We may end up dropping that.

HP updates - The button bar on my 2510 got replaced last week. I now have working volume buttons again. However, the machine now reboots whenever it is suspended and I close the lid. Diagnosed to either a faulty switch assembly or system board, which will require an engineer visit. An engineer dropped round today to fix the touchpad. Despite the case notes clearly stating that the problem was with the cable assembly, he was sent a replacement top cover unit. Without any cables. So he's coming back at some point. HP's customer support system apparently does not allow these cases to be merged. Which means I now have two visits to look forward to.

Android - I'm gradually working my way through the code, replacing various custom interfaces with standard ones. const char * const LCD_BACKLIGHT = "/sys/class/leds/lcd-backlight/brightness"; is an interesting standout so far.

Syndicated 2009-02-23 18:43:34 from Matthew Garrett

19 Feb 2009 »

In other news, my HP 2510p's screen was replaced last month after the hinge snapped. A chunk of plastic off the hinge cover snapped off two days ago. I'm somewhat puzzled by this, since I can't see any plausible way force could be applied to it - it's as if it came away slightly and then got crushed when I tried to close the lid. On top of the motherboard having been replaced 4 times now (twice due to faulty power connectors, once due to the fan being replaced and the motherboard being swapped at the same time, once because the machine started refusing to boot at LCA) and it still being slightly temperamental when booting, I'm not overly impressed - especially when I've only had it 18 months. Nobody else I know with one seems to have had the same level of difficulty, though dreadful thermal issues (especially when using the dock) seem to be common.

The X200s looks awfully shiny, but the 1440x900 screen option doesn't appear to be available in the UK. Oddly, despite having a SIM slot, it also doesn't seem to come with an HSDPA option.

Syndicated 2009-02-19 01:29:01 from Matthew Garrett

19 Feb 2009 »

Aside from the inherent humour in OpenSolaris's attempt to migrate to a 15 year old shell, today brings the thrilling news that I'll be moving to Boston to join the engineering team in Westford, MA. I look forward to the Applebee's. Some of the more entertaining aspects of US immigration mean that it'll probably be in July at the earliest (365 days with the company, plus time spent in the US since starting), which means that I have plenty of time to properly investigate my local pubs to console myself over having to spend the rest of my life drinking American beer.

Syndicated 2009-02-19 01:18:20 from Matthew Garrett
