10 Nov 2009 mjg59   » (Master)

ACPI general purpose events

ACPI is a confusing place. It's often thought of as a suspend/resume thing, though if you're unlucky you've learned that it's also involved in boot-time configuration because it's screwed up your interrupts again. But ACPI's also heavily involved in the runtime management of the system, and it's necessary for there to be a mechanism for the hardware to alert the OS of events.

ACPI handles this case by providing a set of general purpose events (GPEs). The implementation of these is fairly straightforward - an ACPI table points at a defined system resource (typically an area of system io space, though in principle it could be something like mmio instead), and when the hardware fires an ACPI interrupt the kernel looks at this region to see which GPEs are flagged. Then things get more interesting.

The majority of GPEs are implemented in the ACPI tables via methods with names like _Lxx or _Exx. The xx is the number of the GPE in hex, while the leading _L or _E indicates whether the GPE is level- or edge-triggered. If an ACPI interrupt is fired and GPE 0x1D is flagged as being the source of the interrupt, the ACPI interpreter will then look for an _L1D or _E1D method. Upon finding one, it'll execute it. What this method does is entirely up to the firmware - on most HP laptops, GPE 0x1D is hooked up to the lid switch[1] and so executing it will send a notification to the OS that the lid switch has changed state. The OS will then evaluate the state of the lid switch (generally by making another ACPI query) and send the event up to userspace.

How does the lid end up triggering GPE 0x1D? Things get pretty hardware specific at this point. Intel motherboard chipsets have a set of general purpose io (GPIO) lines that can, for the most part[2], be used by the system vendor for anything they want. For a lid switch, one of these lines is hooked to the switch and the BIOS configures the GPIO as an input. Pressing the switch will cause the GPIO line to become active. The GPIO lines are mapped to GPEs in a 1:1 manner, though with an offset of 16 - ie, GPIO 0xd will map to GPE 0x1d. If GPIO 0xd becomes active, GPE 0x1d will be flagged and an ACPI interrupt sent. The ACPI code will then do something to quash the interrupts, such as inverting the polarity of the GPIO[3], as well as send the notification to the OS.

Why are the GPIOs offset by 16 relative to the GPEs? The lower 16 GPEs (again, talking about Intel hardware) have pre-defined purposes[4]. These range from things like "Critically low battery" to "PCIe hotplug event" down to "This device triggered a wakeup". And the latter is what I'm most interested in here.

Various pieces of modern hardware can be placed into power saving states when not in use. The problem with this is that the user experience of having to turn on hardware before you can use it is not a good one, so in order to make this the default behaviour we need the hardware to tell us that something happened that requires us to wake the hardware up.

There's something of a chicken and egg problem here, but thankfully most of the relevant modern hardware has out of band mechanisms to tell us about things going on. The PCI spec defines something called Power Management Events (PME), which are driven by an additional current that's supplied to the hardware even when it's otherwise turned off. On plug-in PCI Express cards, firing a PME generates an interrupt on the root bridge and a native driver can interpret that, but for legacy PCI devices and integrated chipset devices the notification has to come via ACPI.

The example I've been working on is USB. It's a good choice for various reasons - firstly, there's already support for detecting when the USB controller is idle. Secondly, modern USB host controllers have support for generating PMEs on device insertion, removal or (and this is important) remote wakeup. In other words, as long as the USB bus is idle we can power down the entire USB controller. If the OS tries to access a USB device, we'll power it back up. If the user unplugs or plugs a device, we'll power it back up. If a previously idle device suddenly responds to some external input, we'll power it back up. And it's all nicely invisible to the user.

How does this work? The controller retains a small amount of power even when nominally pwoered down. This is used to keep the detection circuitry alive. When it receives a wakeup event, it asserts the PME line. The chipset detects this and fires a GPE. The OS runs this GPE and receives a device notification on the ACPI representation of the USB controller, telling us to power it back up. We do so and process whatever woke us - if the bus then goes idle again, we can power down once more.

The astonishing thing is that this all works. The only problem we have is that it relies on the machine vendor to have provided the ACPI methods that are associated with the GPEs. If they haven't, we can't enable this functionality - even though the hardware is capable of generating the GPEs, we have no method to execute to let us know which device has to be woken up. The GPE is never answered, we never acknowledge the PME and the hardware keeps on screaming for attention without getting any. And, more to the point, it never gets powered up and your mouse doesn't work.

There's a pretty gross hack to deal with this. In general, we know what the GPE to device mappings are - they're pretty static across Intel chipsets, and while AMD ones can be programmed differently by the BIOS we can read that information back and set up a mapping ourselves. This trick also comes in handy when some vendors (like, say, Dell) manage to implement one of the GPE events wrongly. Everything looks like it should work, but the method never sends a notification because it's buggy. In that case we can unregister the existing method and implement our own instead.

This code isn't upstream yet, but patches have been posted to the linux-acpi mailing list and with luck it'll be there in the 2.6.33 timeframe. My tests suggest about 0.2W saving per machine, which isn't going to save all that many polar bears but seems worth it anyway.

[1] _L1D = lid. Sigh.

[2] There's a few that are reserved for specific purposes

[3] So where before it had to be high to be active, it now has to be low to be active - this means that it'll now trigger on the switch being opened rather than closed, so you'll get another event when you open the lid again.

[4] You can find a list in the documentation for the appropriate ICH chip - the relevant section is "GPE0_STS" under the LPC interface chapter.

Syndicated 2009-11-10 03:08:36 from Matthew Garrett

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!