Older blog entries for mjg59 (starting at number 265)

USB runtime power management

I've just committed some patches to the rawhide (not F14) tree that re-enable USB autosuspend on some devices. This set includes a workaround in the bluetooth input code that should handle the case where people were seeing their input devices become laggy when autosuspend was enabled, but there's still some chance that other bluetooth devices will behave slightly oddly. If that's the case then try:

echo on >/sys/class/bluetooth/hci0/device/power/control

and see if it improves things. If so then please file a bug and include information about the device you're trying to connect to.
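Before filing a bug it can help to see which devices are currently allowed to autosuspend. The following is a minimal sketch rather than anything shipped in the patchset: it assumes the usual sysfs layout under /sys/bus/usb/devices, where each device's power/control file reads "on" (autosuspend disabled) or "auto" (runtime power management allowed). The function name usb_power_control is made up for illustration.

```python
from pathlib import Path


def usb_power_control(sysfs_root="/sys/bus/usb/devices"):
    """Return {device name: contents of power/control} for each USB device.

    The default path assumes the standard sysfs layout; pass another
    directory to inspect a different bus or a test fixture.
    """
    settings = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return settings
    for dev in sorted(root.iterdir()):
        control = dev / "power" / "control"
        try:
            # Expected values are "on" or "auto".
            settings[dev.name] = control.read_text().strip()
        except OSError:
            # Nodes without a readable control file are skipped.
            continue
    return settings
```

Calling usb_power_control() and printing the result gives a quick before/after comparison when toggling a device with the echo command above.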

The other thing this patchset does is enable USB HCD runtime power management by default. This means that if you have a host controller with nothing connected, or with connected but suspended devices, the host controller will be powered down. It'll automatically power back up if a device is connected or a connected device sends a wakeup event. The obvious failure mode here is that your USB ports stop working. If that happens, please file a bug and include the output of lspci and dmidecode (you may want to edit the dmidecode output to make sure that your serial numbers are removed). In both cases, assign the bug to me (mjg at redhat.com).

Once this seems fairly stable I'm going to be looking at powering down more mostly unused PCI devices. SD readers and firewire controllers are typically unused and both support generating wakeup events, so they're the next step.

Syndicated 2010-09-17 18:07:18 from Matthew Garrett

Thoughts on upstreams

Last month I gave a presentation on the interaction between Android and kernel upstream at Linuxcon. The video for that is now available here (requires registration). Contrary to stories you may have heard, I do not dropkick anyone through a window.

There are some parallels between the Android/upstream scenario and Canonical's approach to upstream. Mark wrote a lengthy defence of Canonical's focus on components that they feel need development, while not putting development effort into things they feel are good enough already. That's pretty consistent with the discussions I had with him at the Ubuntu development meeting in Oxford over six years ago. Back then the focus was on taking all the excellent software that already existed and concentrating on providing it as a single polished and integrated product. It was successful - what's easy to forget now is that the first release of Ubuntu was massively more usable out of the box than any other Linux distribution available at the time, and it's absolutely undeniable that its release spurred increased efforts on the part of competitors. But I don't think the same focus is being applied any more.

The most obvious (and most controversial) example is the Ayatana project. Ayatana's a pretty explicit statement that Canonical don't view the existing Gnome UI as being suitable for their vision of the Linux desktop. That's absolutely fine. However, unlike many of the papercut projects, Ayatana is a set of complete reimplementations of functionality that behave differently to their upstream equivalents. There's no meaningful sense in which that's not a fork of Gnome. And, let me emphasise, I don't think that's a bad thing.

Let's go back to Android. While we talk about how much we'd have preferred it if Google had interacted with upstream before implementing wakelocks, the realistic outcome is that there's no way that we'd have accepted them even if they'd been posted years before hardware shipped. If there'd been a productive outcome to that aspect of the conversation it would have involved Android having a very different power management policy. Sometimes you're not going to convince people that something is better without implementing it and seeing how well it works. The problem with wakelocks is not that they exist or that Google feels they work better than the alternative, but that they end up integrated into drivers in a way that can cause divergence between mainline and Android.

Ayatana potentially has the same outcome. At some level, well-integrated applications have to be aware of the environment in which they're running. The application indicator patches are an example of that. Carrying conditional code increases maintenance burden, reduces test coverage and is a net loss in the long run. If upstream Gnome never gains support for this functionality, and if Ubuntu continues to depend on it, then Canonical have a delta that they'll have to carry forever. That benefits nobody.

Forking because you believe that your approach is better is a completely valid development model, but in the long run can cause problems if you don't have a long-term strategy for how to resolve that fork. For all we criticise Google's ability to get Android code into the mainline kernel, they've put orders of magnitude more effort into doing so than Canonical have in terms of getting Ayatana's code into mainline Gnome. This isn't a function of Google's relative size - the Android kernel team is on the order of 10 people. It's not enough of a difference to explain the disparity.

Canonical would be perceived as much better team players if there was an indication of their long-term plan in terms of Unity and the Ayatana projects and getting that code into mainline Gnome and integrating with the Gnome shell. It's completely unsurprising that they're viewed with distrust until that happens.

Syndicated 2010-09-15 13:37:09 from Matthew Garrett

Linux backlight control

Backlight control is one of those things that you'd think would be simple, but ha ha this is computing so of course it's an utter disaster and everything is a huge mess. There are three main classes of backlight control in the x86 world, all of which have drawbacks:

  • ACPI specifies a mechanism for backlight control, and the majority of modern machines implement it. It has the advantage that the brightness query interface is generally aware of anything else in the system which may have changed the brightness, so it's unlikely to get out of sync with reality if the platform tries to do something odd like change the backlight itself in response to an ambient light sensor or some other event. The main drawback is that there's typically a fairly small number of available backlight values, usually somewhere between 8 and 20.
  • A platform-specific mechanism. This used to be more popular before the ACPI backlight interface took off, but some machines still require it. The idea here is that there's some sort of platform-specific way of requesting a backlight change, ranging from a vendor-specific ACPI method through to triggering system management calls by magic register writes. These methods usually (but not always) keep in sync with other firmware changes, but rarely provide any more brightness steps than the standard ACPI interface.
  • Many mobile GPUs have backlight control registers built in. These usually give you a range of several thousand possible values, but using them will almost certainly leave you out of sync with reality if the firmware touches them at all. To make things worse, the firmware control of the backlight may occur after the GPU - so you could end up with two different controls that both need to be full in order to get maximum brightness. The worst case scenario is that the firmware gets confused by the values not being what it programmed and you end up with a hung machine.

Right now, if there's an ACPI backlight interface then that's usually the only thing we'll show you. We can do that because we can identify if there's an ACPI backlight interface when we parse the ACPI tables at the start of booting, and that information can be registered before we start setting up any other backlights. The problem comes when we have no ACPI backlight interface. We don't have any idea whether there's a platform mechanism until a platform driver loads, which could be at any time. As a result, we've been reluctant to expose GPU-level backlight control because doing so would often give you two separate backlight controls and no indication as to which should be used. Userspace doesn't really have a way to make that decision either, so everyone ends up unhappy.

This is especially problematic with some machines which provide no ACPI or known platform control (or, in the case of some Samsungs, only provide platform control if you have a special Linux BIOS that Samsung won't give you) but can control the backlight via the GPU. Right now you get nothing, because giving you something would potentially break other systems and the needs of the many etc. Sorry! But this is obviously problematic in the long term, especially because multi-GPU machines tend to have multiple ACPI backlight interfaces, so I've been working on a better approach.

When a backlight device is registered, it appears under /sys/class/backlight. If it's an ACPI device it has a symlink pointing to some random ACPI device. If it's a platform device it's pointing at something like "dell-laptop" which is approximately unhelpful when it comes to figuring out what it controls. If it's a GPU-level device then it probably points at the PCI device, which is helpful except in the case where you have multiple backlight controls on a single GPU. So, by and large, you have no good way to identify which backlight control is preferable unless you keep a huge list of all possible backlights along with some scoring.

The first thing I've added to improve this is a "type" attribute. This tells you whether a given backlight is firmware-level (like ACPI), platform-level (like the various laptop drivers) or performs raw register writes (like a GPU driver). That lets userspace decide which interface is preferable. It'll typically be the ACPI interface, because that's the most likely to keep synchronisation and so avoid bizarre brightness bugs. The next thing has been to start fixing up the parent links. There's nothing we can do for the platform level devices, but the ACPI drivers could at least point at PCI devices rather than into ACPI space. That means that multi-GPU systems can now identify which interface to use based on the currently active GPU. Finally, I've started pointing the GPU-level backlight controls at the specific output rather than merely at the PCI device. This probably makes little difference for laptops as such, but once we start exposing backlight control for monitors that support ddcci it'll make things much easier as we'll know which backlight control corresponds to which monitor.
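As a rough illustration of how userspace might use the "type" attribute, here is a minimal sketch. It is not the library described in this post: the function name pick_backlight, the assumption that the attribute reads as "firmware", "platform" or "raw", and the platform-before-raw ordering are all mine. Only the preference for the firmware (ACPI) interface comes from the reasoning above.

```python
from pathlib import Path

# Assumed preference order: firmware (ACPI) first because it is the most
# likely to stay in sync, raw register writes last.
PREFERENCE = ("firmware", "platform", "raw")


def pick_backlight(sysfs_root="/sys/class/backlight"):
    """Return the name of the preferred backlight device, or None."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return None
    by_type = {}
    for dev in sorted(root.iterdir()):
        try:
            kind = (dev / "type").read_text().strip()
        except OSError:
            # No readable "type" attribute; ignore this device.
            continue
        # Keep the first device seen for each type.
        by_type.setdefault(kind, dev.name)
    for kind in PREFERENCE:
        if kind in by_type:
            return by_type[kind]
    return None
```

This ignores the parent links entirely, so on a multi-GPU or multi-output system it would still need the per-output matching described above.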

I've then written a small library that accepts information about the output and picks the "best" backlight for the device. It's obviously based on a pile of heuristics and there's a couple of bits of API that I suspect need to be nailed down yet, but it means that this code only needs to be written once. It's then simple to glue this into X drivers, which means that they can expose a "Backlight" xrandr property on each relevant display. That means that backlight control is then handled at the session level with the X server acting as the privileged agent, which simplifies a bunch of things and means we can finally let HAL die entirely. Long-term this means we'll have unified backlight control for all of your displays, which is a wonderful thing.

Summary: We kind of suck right now, but there's a reasonably clear path to getting better.

Syndicated 2010-09-09 21:43:34 from Matthew Garrett


The last few months have been busier than I expected, resulting in various failures to get stuff done. The good news is that things are a little more relaxed now and I'm gradually catching up, but if you've emailed me and I haven't replied then you should probably email me again.

A couple of updates - the source code for the Augen Android tablets I wrote about still hasn't been released, but the vendor does seem to be doing their best. Their supplier seems to be refusing to hand over the source code (they were given a tarball that was supposed to be it, but in fact just turned out to be the various GPLed bits of Android) and they're obviously stuck between a rock and a hard place. The obvious observation is that they should have done due diligence before starting to ship these things, but given that they're out in the wild and they're trying to improve things it seems reasonable to carry on working to try to obtain the source rather than insisting that Kmart stop selling them.

Fusion Garage, on the other hand, are still failing to provide source and seem entirely unconcerned about it - they've failed to respond to any of my emails since the first. Augen aren't providing source because they can't, while Fusion Garage aren't providing source because they won't. Irked by this, I've decided to try Don Marti's suggestion and file a case with US Customs. I'll admit that I have absolutely no idea how seriously these cases get taken, and so I've no great expectation of any sort of interesting outcome. But even so, if you're in the US and try to buy a Joojoo then there's a chance that it'll be seized by US customs on the way in.

Syndicated 2010-09-09 13:58:36 from Matthew Garrett


If you get messages like this:

ACPI Error (psparse-0537): Method parse/execution failed [\_SB_._OSC] (Node ffff8801e8c62b30), AE_AML_BUFFER_LIMIT

then you've fallen foul of one of the less appealing aspects of ACPI. _OSC methods are defined as methods to allow the operating system and the firmware to handshake over their support of optional features. Different _OSC methods apply to different types of hardware. CPU _OSC methods allow the operating system to inform the firmware that it supports ACPI 3.0 throttling states, while PCIe _OSC methods allow the operating system to indicate that it can manage native PCIe hotplug. The firmware can choose whether or not to give up control of these features, and the OS then has to cope.

The problem arises when we get to the _OSC method on the system bus. This wasn't specified until ACPI 4.0, leaving an attractive mechanism for vendors to add OS/firmware integration. To that end, we now have at least three different _SB_._OSC methods in the wild:

  1. The ACPI specified _OSC. This exists purely for the OS to tell the firmware what it supports, without the firmware having the opportunity to disagree. As such it passes 8 bytes of data to the method.
  2. The Microsoft WHEA _OSC. This exists to allow the firmware and Windows to handshake over whether or not the firmware supports Microsoft's hardware error reporting.
  3. HP's PCC _OSC method, designed to allow handshaking between the OS and the firmware in order to determine whether they support OS interaction with the firmware-level CPU scaling.

This wouldn't be enough to be a problem in itself. The ACPI spec requires _OSC methods to have GUIDs in order to protect users from exactly this kind of situation - Windows can attempt to enable WHEA on a machine with a spec-compliant _OSC, and the _OSC method will return an immediate failure because the GUID doesn't match. Except that the WHEA and PCC versions of _SB_._OSC pass 12 bytes of data in the third argument, against the ACPI spec version's 8. And many _OSC implementations attempt to access the region between bytes 9 and 12 before checking the GUID, resulting in the AE_AML_BUFFER_LIMIT error.

This is made even more annoying due to the fact that argument 2 contains the number of parameters being passed in argument 3, making it straightforward to avoid this kind of failure. Firmware authors, tbh.
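The ordering problem can be modelled outside of AML. The sketch below is a Python stand-in, not real firmware code: VENDOR_UUID is a made-up value, the status bits follow my reading of the ACPI spec's _OSC return codes, and struct.error plays the role of AE_AML_BUFFER_LIMIT. The buggy handler reads the third 32-bit word before checking the GUID, while the safe one checks the GUID and the word count first.

```python
import struct
import uuid

# Hypothetical vendor UUID, for illustration only.
VENDOR_UUID = uuid.UUID("12345678-1234-5678-1234-567812345678")

# Simplified status values (assumed from the ACPI spec's _OSC status bits).
OSC_OK = 0x0
OSC_FAILURE = 0x2
OSC_UNRECOGNIZED_UUID = 0x4


def osc_buggy(guid, count, caps):
    """Reads bytes 9-12 before looking at the GUID or the count."""
    # Raises struct.error when the caller passed only 8 bytes.
    struct.unpack_from("<I", caps, 8)
    if guid != VENDOR_UUID:
        return OSC_UNRECOGNIZED_UUID
    return OSC_OK


def osc_safe(guid, count, caps):
    """Checks the GUID and the word count before touching the buffer."""
    if guid != VENDOR_UUID:
        return OSC_UNRECOGNIZED_UUID
    if count < 3 or len(caps) < 12:
        return OSC_FAILURE
    struct.unpack_from("<I", caps, 8)
    return OSC_OK
```

With a spec-style call (a different UUID and an 8-byte buffer), osc_safe returns the unrecognised-UUID status, whereas osc_buggy fails on the buffer access before it ever gets to the GUID check.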

Syndicated 2010-08-06 20:44:00 from Matthew Garrett

Kmart Android tablets and the GPL

The Augen Android tablet being sold in Kmart stores at the moment is (shockingly) running a 2.6.29 kernel and Android 2.1 on top of that. It's also (shockingly) currently impossible to get hold of the source code for the kernel - Augen (whose corporate address is a small unit in Florida) say that the software comes installed on the units by the OEM and they don't have any access to the source either. This isn't an excuse, of course, and they say that they hope to have it on their website within the next few days - but even so, it seems that the Android device GPL violation trend is still on course. It'll be interesting to see what the long-term outcome of this kind of violation is, especially with these devices increasingly being sold by mainstream stores.

Syndicated 2010-07-29 21:51:22 from Matthew Garrett

Meego kernel watch

sgx535 drivers in today's Meego kernel tree: 3 (GMA600, CE4100, N900)
sgx535 drivers submitted upstream: 1 (Tungsten GMA500 driver, submitted March 2009, rejected due to significant chunks of functionality there purely to support closed userspace)

To be fair, the rest of the Moorestown support code seems to be shaping up fairly nicely. But the lack of a coherent story about what graphics support is going to look like isn't hugely reassuring.

Syndicated 2010-07-23 19:22:22 from Matthew Garrett

Power management at Plumbers

I'm running a power management track at the Linux Plumbers Conference again this November. Unlike most conferences which focus on presenting completed work, Plumbers is an opportunity to focus on unsolved problems and throw around as many half-baked solutions as you want in order to try to find one that seems to stick. The suspend/resume problem in Linux is mostly solved[1], which means that it's time for us to focus on runtime power management and quality of service.

This has been an especially interesting year in the field. We've landed the infrastructure for generic runtime power management, glued that into PCI and started implementing that at the driver level. pm_qos is being reworked to improve performance and scalability as we start seeing more drivers that need to express their own constraints. And, of course, we had the wakelock/suspend blockers conversation that didn't end in a terribly satisfactory manner, although Rafael is now working on an implementation that presents equivalent functionality with a different userspace API. Runtime full-system suspend isn't solved yet either - the current cpuidle-based solution doesn't work well on multicore systems. And maybe we could be more aggressive still by looking at reclocking more system components on the fly even if the existing interfaces don't allow that. Do we have all the hooks we need to identify which system resources are being used? Are we doing the best we can in terms of avoiding trading off performance for power savings?

So if you'd like to talk about any of these things, or if there are any other problems that you don't think have been solved yet, head on over to the call for submissions and help make sure that we can make Linux the most power-efficient OS possible.

[1] Yes, some machines are broken, but those tend to be individual weird bugs which we're gradually tracking down rather than fundamental issues in our core code, so they're not really in the scope of Plumbers

Syndicated 2010-07-12 13:53:43 from Matthew Garrett


It turns out that it's actually really, really easy to set up an L2TP tunnel. You just need to install xl2tpd, configure some address ranges and then add an authentication entry to chap-secrets. It's just that the entire known universe appears to be more interested in using IPsec as well, and that looks worse than setting up Kerberos and I've already done that enough in my life thanks. I don't care about my connection being encrypted (I've got encrypted protocols for that), so this seems to be an entirely reasonable solution.

Syndicated 2010-06-29 18:41:21 from Matthew Garrett

The paradox of choice

Searching for information on setting up an L2TP VPN takes me here, where I get to choose between OpenSWAN, KAME and some OpenBSD port. Searching for information on setting up a PPTP VPN takes me here, where I'm told exactly what I need to do.

Given choices, I chose the one that reduced my choices. THERE IS A LESSON HERE.

(Sadly, I'm now going to have to deal with L2TP anyway because something in the intermediate network is dropping GRE)

Syndicated 2010-06-29 17:37:04 from Matthew Garrett
