Sigh.
If only eeepc-laptop sent standard keycodes, or something.Legacy PC design misery
I've spent chunks of the last couple of days fighting a problem that's existed for about 25 years. The 8086 was a 16-bit processor with a 20-bit address space, limiting the maximum physical address that could be accessed to 1MB. However, quirks of the segmented memory system meant that addresses greater than 1MB could be constructed - these would wrap around to the bottom of memory. Because loading the segment registers was a time consuming operation, some programmers used this behaviour as a performance optimisation.The ACPI Embedded Controller
Of course, the event model I described before is far too simple to be worthy of a place in the ACPI spec. At the most basic level, there's more possible events than there are GPEs to attach them to, so there's a need for some further complexity. This manifests itself in the form of the ACPI embedded controller (EC).ACPI general purpose events
ACPI is a confusing place. It's often thought of as a suspend/resume thing, though if you're unlucky you've learned that it's also involved in boot-time configuration because it's screwed up your interrupts again. But ACPI's also heavily involved in the runtime management of the system, and it's necessary for there to be a mechanism for the hardware to alert the OS of events.ACPI handles this case by providing a set of general purpose events (GPEs). The implementation of these is fairly straightforward - an ACPI table points at a defined system resource (typically an area of system io space, though in principle it could be something like mmio instead), and when the hardware fires an ACPI interrupt the kernel looks at this region to see which GPEs are flagged. Then things get more interesting.
The majority of GPEs are implemented in the ACPI tables via methods with names like _Lxx or _Exx. The xx is the number of the GPE in hex, while the leading _L or _E indicates whether the GPE is level- or edge-triggered. If an ACPI interrupt is fired and GPE 0x1D is flagged as being the source of the interrupt, the ACPI interpreter will then look for an _L1D or _E1D method. Upon finding one, it'll execute it. What this method does is entirely up to the firmware - on most HP laptops, GPE 0x1D is hooked up to the lid switch[1] and so executing it will send a notification to the OS that the lid switch has changed state. The OS will then evaluate the state of the lid switch (generally by making another ACPI query) and send the event up to userspace.
How does the lid end up triggering GPE 0x1D? Things get pretty hardware specific at this point. Intel motherboard chipsets have a set of general purpose io (GPIO) lines that can, for the most part[2], be used by the system vendor for anything they want. For a lid switch, one of these lines is hooked to the switch and the BIOS configures the GPIO as an input. Pressing the switch will cause the GPIO line to become active. The GPIO lines are mapped to GPEs in a 1:1 manner, though with an offset of 16 - ie, GPIO 0xd will map to GPE 0x1d. If GPIO 0xd becomes active, GPE 0x1d will be flagged and an ACPI interrupt sent. The ACPI code will then do something to quash the interrupts, such as inverting the polarity of the GPIO[3], as well as send the notification to the OS.
Why are the GPIOs offset by 16 relative to the GPEs? The lower 16 GPEs (again, talking about Intel hardware) have pre-defined purposes[4]. These range from things like "Critically low battery" to "PCIe hotplug event" down to "This device triggered a wakeup". And the latter is what I'm most interested in here.
Various pieces of modern hardware can be placed into power saving states when not in use. The problem with this is that the user experience of having to turn on hardware before you can use it is not a good one, so in order to make this the default behaviour we need the hardware to tell us that something happened that requires us to wake the hardware up.
There's something of a chicken and egg problem here, but thankfully most of the relevant modern hardware has out of band mechanisms to tell us about things going on. The PCI spec defines something called Power Management Events (PME), which are driven by an additional current that's supplied to the hardware even when it's otherwise turned off. On plug-in PCI Express cards, firing a PME generates an interrupt on the root bridge and a native driver can interpret that, but for legacy PCI devices and integrated chipset devices the notification has to come via ACPI.
The example I've been working on is USB. It's a good choice for various reasons - firstly, there's already support for detecting when the USB controller is idle. Secondly, modern USB host controllers have support for generating PMEs on device insertion, removal or (and this is important) remote wakeup. In other words, as long as the USB bus is idle we can power down the entire USB controller. If the OS tries to access a USB device, we'll power it back up. If the user unplugs or plugs a device, we'll power it back up. If a previously idle device suddenly responds to some external input, we'll power it back up. And it's all nicely invisible to the user.
How does this work? The controller retains a small amount of power even when nominally pwoered down. This is used to keep the detection circuitry alive. When it receives a wakeup event, it asserts the PME line. The chipset detects this and fires a GPE. The OS runs this GPE and receives a device notification on the ACPI representation of the USB controller, telling us to power it back up. We do so and process whatever woke us - if the bus then goes idle again, we can power down once more.
The astonishing thing is that this all works. The only problem we have is that it relies on the machine vendor to have provided the ACPI methods that are associated with the GPEs. If they haven't, we can't enable this functionality - even though the hardware is capable of generating the GPEs, we have no method to execute to let us know which device has to be woken up. The GPE is never answered, we never acknowledge the PME and the hardware keeps on screaming for attention without getting any. And, more to the point, it never gets powered up and your mouse doesn't work.
There's a pretty gross hack to deal with this. In general, we know what the GPE to device mappings are - they're pretty static across Intel chipsets, and while AMD ones can be programmed differently by the BIOS we can read that information back and set up a mapping ourselves. This trick also comes in handy when some vendors (like, say, Dell) manage to implement one of the GPE events wrongly. Everything looks like it should work, but the method never sends a notification because it's buggy. In that case we can unregister the existing method and implement our own instead.
This code isn't upstream yet, but patches have been posted to the linux-acpi mailing list and with luck it'll be there in the 2.6.33 timeframe. My tests suggest about 0.2W saving per machine, which isn't going to save all that many polar bears but seems worth it anyway.
[1] _L1D = lid. Sigh.
[2] There's a few that are reserved for specific purposes
[3] So where before it had to be high to be active, it now has to be low to be active - this means that it'll now trigger on the switch being opened rather than closed, so you'll get another event when you open the lid again.
[4] You can find a list in the documentation for the appropriate ICH chip - the relevant section is "GPE0_STS" under the LPC interface chapter.
Looking to the past
It’s an oft-voiced suggestion that rather than looking at the bad things that happen in our communities, we should focus on the good things. There’s a number of highly successful geek women already – should we not be concentrating on encouraging more of them, rather than scaring people away with tales of thoughtlessness, discrimination and outright abuse?More GMA500
But is Intel really the party at fault, here?Asymmetries in offence
I wasn't going to write about this since I thought that Chris's post covered pretty much everything I would have said, but after reading Scott's entry on how people would have interpreted Mark's remarks differently if he'd said "We'll have less trouble explaining to boys what we actually do" instead I realised that people are still confused about the fundamental issue here.Intel IGD opregion and GMA500
A while back, Intel defined a specification for binding ACPI-defined methods for controlling hardware to the OS-specific driver, ensuring that the two don't get out of synchronisation. I added support for this to the in-kernel i915 driver last year, and after a couple of awkwardnesses it works well now. One consequence of this that showed up slightly later is that it's necessary to do some of the setup from the i915 driver rather than the ACPI driver, which meant that we had to defer the ACPI driver from binding until the drm driver had done that setup.Portland
I'm off to Boston in under 16 hours, and I'll be getting into Portland around lunchtime on Monday. I'll be talking at Linuxcon about how we're broadening power management on Linux to be applicable from phones through netbooks up to supercomputers - that's 10:15 on Tuesday. At 10AM on Friday I'll be presenting at the Linux Plumbers conference on how userspace can express its requirements to the kernel more clearly, thereby allowing the kernel to be smarter about powering down hardware. And after a short hop down to SF for the weekend, I'll be back in Portland at the X developers conference talking about the role of X in providing relevant information to the kernel and using that to facilitate more aggressive power management.Bye
I'm moving to the US on Thursday, so I will be here on Wednesday evening from about 7. If your presence is unlikely to make me stupifyingly angry, feel free to join me.FOAF updates: Trust rankings are now exported, making the data available to other users and websites. An external FOAF URI has been added, allowing users to link to an additional FOAF file.
Keep up with the latest Advogato features by reading the Advogato status blog.
If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!