ACPI general purpose events
ACPI is a confusing place. It's often thought of as a suspend/resume
thing, though if you're unlucky you've learned that it's also involved
in boot-time configuration because it's screwed up your interrupts
again. But ACPI's also heavily involved in the runtime management of
the system, and it's necessary for there to be a mechanism for the
hardware to alert the OS of events.
ACPI handles this case by providing a set of general purpose events
(GPEs). The implementation of these is fairly straightforward - an
ACPI table points at a defined system resource (typically an area of
system I/O space, though in principle it could be something like MMIO
instead), and when the hardware fires an ACPI interrupt the kernel
looks at this region to see which GPEs are flagged. Then things get
more interesting.
The majority of GPEs are implemented in the ACPI tables via methods
with names like _Lxx or _Exx. The xx is the number of the GPE in hex,
while the leading _L or _E indicates whether the GPE is level- or
edge-triggered. If an ACPI interrupt is fired and GPE 0x1D is flagged
as being the source of the interrupt, the ACPI interpreter will then
look for an _L1D or _E1D method. Upon finding one, it'll execute
it. What this method does is entirely up to the firmware - on most HP
laptops, GPE 0x1D is hooked up to the lid switch[1] and so executing
it will send a notification to the OS that the lid switch has changed
state. The OS will then evaluate the state of the lid switch
(generally by making another ACPI query) and send the event up to
userspace.
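As a rough sketch of that lookup (not the kernel's actual code), here's the flagged-bit scan and method name construction in Python. The function names are made up for illustration, and the separate enable mask is an assumption on my part - it's modelled on the usual status/enable register pairing rather than anything described above. A GPE should only have one of the two methods, so the order in which they're tried here is arbitrary.

```python
def flagged_gpes(status, enable):
    """Return the numbers of GPEs that are both flagged and enabled."""
    active = status & enable
    return [bit for bit in range(active.bit_length()) if (active >> bit) & 1]


def candidate_methods(gpe):
    """Method names for a GPE: _Lxx (level) and _Exx (edge), xx in hex."""
    return ["_L%02X" % gpe, "_E%02X" % gpe]


def dispatch(status, enable, firmware_methods):
    """Map each active GPE to the method the firmware provides, or None."""
    result = {}
    for gpe in flagged_gpes(status, enable):
        found = [m for m in candidate_methods(gpe) if m in firmware_methods]
        result[gpe] = found[0] if found else None
    return result
```

With only bit 0x1D set in the status value and a firmware that provides _L1D, dispatch() maps GPE 0x1D to "_L1D". A flagged GPE with no matching method maps to None.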
How does the lid end up triggering GPE 0x1D? Things get pretty
hardware specific at this point. Intel motherboard chipsets have a set
of general purpose I/O (GPIO) lines that can, for the most part[2], be
used by the system vendor for anything they want. For a lid switch,
one of these lines is hooked to the switch and the BIOS configures the
GPIO as an input. Pressing the switch will cause the GPIO line to
become active. The GPIO lines are mapped to GPEs in a 1:1 manner,
though with an offset of 16 - i.e., GPIO 0xd will map to GPE 0x1d. If
GPIO 0xd becomes active, GPE 0x1d will be flagged and an ACPI
interrupt sent. The ACPI code will then do something to quash the
interrupts, such as inverting the polarity of the GPIO[3], as well as
send the notification to the OS.
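The GPIO to GPE arithmetic is simple enough to write down. This is a minimal sketch of the Intel mapping as described here. The constant and function names are mine, and other chipsets may not follow the same layout.

```python
GPIO_GPE_OFFSET = 16  # offset between GPIO and GPE numbers on Intel chipsets


def gpio_to_gpe(gpio):
    """Return the GPE number that a given GPIO line raises."""
    if gpio < 0:
        raise ValueError("GPIO numbers are non-negative")
    return gpio + GPIO_GPE_OFFSET


def gpe_to_gpio(gpe):
    """Return the GPIO behind a GPE, or None if the GPE isn't GPIO-backed."""
    if gpe < GPIO_GPE_OFFSET:
        return None
    return gpe - GPIO_GPE_OFFSET
```

So gpio_to_gpe(0xd) gives 0x1d, and going the other way, anything below 16 comes back as None because it isn't backed by a GPIO.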
Why are the GPIOs offset by 16 relative to the GPEs? The lower 16 GPEs
(again, talking about Intel hardware) have pre-defined
purposes[4]. These range from things like "Critically low battery" to
"PCIe hotplug event" down to "This device triggered a wakeup". And the
latter is what I'm most interested in here.
Various pieces of modern hardware can be placed into power saving
states when not in use. The problem with this is that the user
experience of having to turn on hardware before you can use it is not
a good one, so in order to make power saving the default behaviour we
need the hardware to tell us that something happened that requires us
to wake it up.
There's something of a chicken and egg problem here, but thankfully
most of the relevant modern hardware has out of band mechanisms to
tell us about things going on. The PCI spec defines something called
Power Management Events (PME), which are driven by an additional
current that's supplied to the hardware even when it's otherwise
turned off. On plug-in PCI Express cards, firing a PME generates an
interrupt on the root bridge and a native driver can interpret that,
but for legacy PCI devices and integrated chipset devices the
notification has to come via ACPI.
The example I've been working on is USB. It's a good choice for
various reasons - firstly, there's already support for detecting when
the USB controller is idle. Secondly, modern USB host controllers have
support for generating PMEs on device insertion, removal or (and this
is important) remote wakeup. In other words, as long as the USB bus is
idle we can power down the entire USB controller. If the OS tries to
access a USB device, we'll power it back up. If the user unplugs or
plugs a device, we'll power it back up. If a previously idle device
suddenly responds to some external input, we'll power it back up. And
it's all nicely invisible to the user.
How does this work? The controller retains a small amount of power
even when nominally powered down. This is used to keep the detection
circuitry alive. When it receives a wakeup event, it asserts the PME
line. The chipset detects this and fires a GPE. The OS runs the method
for this GPE and receives a device notification on the ACPI
representation of the USB controller, telling us to power it back up.
We do so and process whatever woke us - if the bus then goes idle
again, we can power down once more.
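The sequence is easier to follow as a toy model. The class and function below are purely illustrative, with names and step strings invented for this sketch. They only mirror the ordering described above and don't correspond to any real driver interface.

```python
class UsbController:
    """Toy model of a host controller that can be powered down while idle."""

    def __init__(self):
        self.powered = True
        self.pme_asserted = False

    def power_down(self):
        self.powered = False

    def wake_event(self):
        # The detection circuitry stays alive and asserts PME, but only
        # when the controller is actually powered down.
        if not self.powered:
            self.pme_asserted = True

    def power_up(self):
        self.powered = True
        self.pme_asserted = False  # powering up acknowledges the PME


def service_gpe(controller):
    """Walk through what the OS does when the wake GPE fires."""
    steps = []
    if controller.pme_asserted:
        steps.append("gpe fired")
        steps.append("notification received")
        controller.power_up()
        steps.append("controller powered up")
    return steps
```

Calling power_down(), then wake_event(), then service_gpe() walks through all three steps and leaves the controller powered with PME cleared. Calling service_gpe() again does nothing, since there's no pending PME.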
The astonishing thing is that this all works. The only problem we have
is that it relies on the machine vendor to have provided the ACPI
methods that are associated with the GPEs. If they haven't, we can't
enable this functionality - even though the hardware is capable of
generating the GPEs, we have no method to execute to let us know which
device has to be woken up. The GPE is never answered, we never
acknowledge the PME and the hardware keeps on screaming for attention
without getting any. And, more to the point, it never gets powered up
and your mouse doesn't work.
There's a pretty gross hack to deal with this. In general, we know
what the GPE to device mappings are - they're pretty static across
Intel chipsets, and while AMD ones can be programmed differently by
the BIOS we can read that information back and set up a mapping
ourselves. This trick also comes in handy when some vendors (like,
say, Dell) manage to implement one of the GPE methods
wrongly. Everything looks like it should work, but the method never
sends a notification because it's buggy. In that case we can
unregister the existing method and implement our own instead.
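Here's a sketch of how that decision might look. This is not the posted patch set: the table contents are placeholders (the real GPE numbers come from the chipset documentation or from reading back what the BIOS programmed), and the function name and the known_broken list are assumptions made for illustration.

```python
# Hypothetical GPE -> device table; the numbers and names are placeholders.
FALLBACK_GPE_DEVICES = {
    0x03: "USB_A",
    0x04: "USB_B",
}


def resolve_wake_handler(gpe, firmware_methods, known_broken=()):
    """Decide who handles a wake GPE.

    Returns ("firmware", method_name) if the firmware has a usable method,
    ("fallback", device) if we have to use our own mapping, or None if we
    know nothing about this GPE.
    """
    for name in ("_L%02X" % gpe, "_E%02X" % gpe):
        if name in firmware_methods and name not in known_broken:
            return ("firmware", name)
    device = FALLBACK_GPE_DEVICES.get(gpe)
    if device is not None:
        return ("fallback", device)
    return None
```

A firmware method that exists but is listed as broken gets skipped in favour of the fallback table, which is the "unregister and replace" case.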
This code isn't upstream yet, but patches have been posted to the
linux-acpi mailing list and with luck it'll be there in the 2.6.33
timeframe. My tests suggest about 0.2W saving per machine, which isn't
going to save all that many polar bears but seems worth it anyway.
[1] _L1D = lid. Sigh.
[2] There are a few that are reserved for specific purposes.
[3] So where before it had to be high to be active, it now has to be
low to be active - this means that it'll now trigger on the switch
being opened rather than closed, so you'll get another event when you
open the lid again.
[4] You can find a list in the documentation for the appropriate ICH
chip - the relevant section is "GPE0_STS" under the LPC interface
chapter.
Syndicated 2009-11-10 03:08:36 from Matthew Garrett