Older blog entries for mjg59 (starting at number 228)

10 Nov 2009 »

ACPI general purpose events

ACPI is a confusing place. It's often thought of as a suspend/resume thing, though if you're unlucky you've learned that it's also involved in boot-time configuration because it's screwed up your interrupts again. But ACPI's also heavily involved in the runtime management of the system, and it's necessary for there to be a mechanism for the hardware to alert the OS of events.

ACPI handles this case by providing a set of general purpose events (GPEs). The implementation of these is fairly straightforward - an ACPI table points at a defined system resource (typically an area of system io space, though in principle it could be something like mmio instead), and when the hardware fires an ACPI interrupt the kernel looks at this region to see which GPEs are flagged. Then things get more interesting.
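The "see which GPEs are flagged" step amounts to a bit scan over the register block. What follows is only a rough sketch, assuming the block has already been read into two equal-length byte strings, one of status bits and one of enable bits. The function name and the sample values are invented for illustration, and a real kernel reads these registers from the hardware rather than from Python bytes.

```python
from typing import List


def flagged_gpes(status: bytes, enable: bytes, base: int = 0) -> List[int]:
    """Return the GPE numbers whose status bit is set and which are enabled.

    `base` is the number of the first GPE in this block (hypothetical
    parameter, useful if a system has more than one block).
    """
    if len(status) != len(enable):
        raise ValueError("status and enable blocks must be the same length")
    fired = []
    for byte_index, (sts, en) in enumerate(zip(status, enable)):
        active = sts & en  # only enabled GPEs can be the interrupt source
        for bit in range(8):
            if active & (1 << bit):
                fired.append(base + byte_index * 8 + bit)
    return fired


# Byte 3, bit 5 is GPE 3 * 8 + 5 = 29 = 0x1D.
print([hex(n) for n in flagged_gpes(bytes([0, 0, 0, 0x20]), bytes([0, 0, 0, 0xFF]))])
```

Masking status with enable matters because a status bit can be set for an event the OS never asked to hear about.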

The majority of GPEs are implemented in the ACPI tables via methods with names like _Lxx or _Exx. The xx is the number of the GPE in hex, while the leading _L or _E indicates whether the GPE is level- or edge-triggered. If an ACPI interrupt is fired and GPE 0x1D is flagged as being the source of the interrupt, the ACPI interpreter will then look for an _L1D or _E1D method. Upon finding one, it'll execute it. What this method does is entirely up to the firmware - on most HP laptops, GPE 0x1D is hooked up to the lid switch[1] and so executing it will send a notification to the OS that the lid switch has changed state. The OS will then evaluate the state of the lid switch (generally by making another ACPI query) and send the event up to userspace.
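The naming rule is simple enough to write down. This is an illustrative sketch, not the ACPI interpreter's actual lookup code: the namespace is modelled as a plain set of method names, and both function names are made up for the example.

```python
from typing import Iterable, Optional, Tuple


def gpe_method_names(gpe: int) -> Tuple[str, str]:
    """Return the level- and edge-triggered method names for a GPE number."""
    if not 0 <= gpe <= 0xFF:
        raise ValueError("GPE number must fit in two hex digits")
    suffix = f"{gpe:02X}"
    return (f"_L{suffix}", f"_E{suffix}")


def find_gpe_method(gpe: int, namespace: Iterable[str]) -> Optional[str]:
    """Return whichever of _Lxx/_Exx the firmware provides, or None."""
    available = set(namespace)
    for name in gpe_method_names(gpe):
        if name in available:
            return name
    return None


print(gpe_method_names(0x1D))
print(find_gpe_method(0x1D, {"_L1D", "_L0B"}))
```

A `None` result corresponds to the case where the firmware provides no method for that GPE.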

How does the lid end up triggering GPE 0x1D? Things get pretty hardware specific at this point. Intel motherboard chipsets have a set of general purpose io (GPIO) lines that can, for the most part[2], be used by the system vendor for anything they want. For a lid switch, one of these lines is hooked to the switch and the BIOS configures the GPIO as an input. Pressing the switch will cause the GPIO line to become active. The GPIO lines are mapped to GPEs in a 1:1 manner, though with an offset of 16 - ie, GPIO 0xd will map to GPE 0x1d. If GPIO 0xd becomes active, GPE 0x1d will be flagged and an ACPI interrupt sent. The ACPI code will then do something to quash the interrupts, such as inverting the polarity of the GPIO[3], as well as send the notification to the OS.

Why are the GPIOs offset by 16 relative to the GPEs? The lower 16 GPEs (again, talking about Intel hardware) have pre-defined purposes[4]. These range from things like "Critically low battery" to "PCIe hotplug event" down to "This device triggered a wakeup". And the latter is what I'm most interested in here.
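The offset arithmetic from the last two paragraphs, as a small sketch. The constant and function names are invented for the example, and the offset of 16 is the Intel-specific value described above rather than anything ACPI itself guarantees.

```python
from typing import Optional

# On the Intel chipsets described above, the lower 16 GPEs have
# pre-defined purposes, so GPIO-backed GPEs start at 16.
GPIO_GPE_OFFSET = 16


def gpio_to_gpe(gpio: int) -> int:
    """Map a GPIO line number to the GPE it raises."""
    if gpio < 0:
        raise ValueError("GPIO number cannot be negative")
    return gpio + GPIO_GPE_OFFSET


def gpe_to_gpio(gpe: int) -> Optional[int]:
    """Map a GPE back to its GPIO line, or None for a pre-defined GPE."""
    if gpe < GPIO_GPE_OFFSET:
        return None
    return gpe - GPIO_GPE_OFFSET


print(hex(gpio_to_gpe(0xD)))
```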

Various pieces of modern hardware can be placed into power saving states when not in use. The problem with this is that the user experience of having to turn on hardware before you can use it is not a good one, so in order to make this the default behaviour we need the hardware to tell us that something happened that requires us to wake the hardware up.

There's something of a chicken and egg problem here, but thankfully most of the relevant modern hardware has out of band mechanisms to tell us about things going on. The PCI spec defines something called Power Management Events (PME), which are driven by an additional current that's supplied to the hardware even when it's otherwise turned off. On plug-in PCI Express cards, firing a PME generates an interrupt on the root bridge and a native driver can interpret that, but for legacy PCI devices and integrated chipset devices the notification has to come via ACPI.

The example I've been working on is USB. It's a good choice for various reasons - firstly, there's already support for detecting when the USB controller is idle. Secondly, modern USB host controllers have support for generating PMEs on device insertion, removal or (and this is important) remote wakeup. In other words, as long as the USB bus is idle we can power down the entire USB controller. If the OS tries to access a USB device, we'll power it back up. If the user unplugs or plugs a device, we'll power it back up. If a previously idle device suddenly responds to some external input, we'll power it back up. And it's all nicely invisible to the user.

How does this work? The controller retains a small amount of power even when nominally powered down. This is used to keep the detection circuitry alive. When it receives a wakeup event, it asserts the PME line. The chipset detects this and fires a GPE. The OS runs the method associated with this GPE and receives a device notification on the ACPI representation of the USB controller, telling us to power it back up. We do so and process whatever woke us - if the bus then goes idle again, we can power down once more.

The astonishing thing is that this all works. The only problem we have is that it relies on the machine vendor to have provided the ACPI methods that are associated with the GPEs. If they haven't, we can't enable this functionality - even though the hardware is capable of generating the GPEs, we have no method to execute to let us know which device has to be woken up. The GPE is never answered, we never acknowledge the PME and the hardware keeps on screaming for attention without getting any. And, more to the point, it never gets powered up and your mouse doesn't work.

There's a pretty gross hack to deal with this. In general, we know what the GPE to device mappings are - they're pretty static across Intel chipsets, and while AMD ones can be programmed differently by the BIOS we can read that information back and set up a mapping ourselves. This trick also comes in handy when some vendors (like, say, Dell) manage to implement one of the GPE events wrongly. Everything looks like it should work, but the method never sends a notification because it's buggy. In that case we can unregister the existing method and implement our own instead.
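The shape of that hack can be sketched as a dispatch with a fallback table. This is not the code from the posted patches. It is a simplified model in which firmware methods are Python callables returning the name of the device to wake, and every name, GPE number and device label in it is hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Optional


def handle_gpe(
    gpe: int,
    firmware_methods: Dict[int, Callable[[], Optional[str]]],
    fallback_devices: Dict[int, str],
    known_buggy: FrozenSet[int] = frozenset(),
) -> Optional[str]:
    """Return the device to wake for this GPE, or None if nobody knows.

    The firmware's method is used when it exists and is not on the
    known-buggy list; otherwise the static GPE-to-device table is used.
    """
    method = firmware_methods.get(gpe)
    if method is not None and gpe not in known_buggy:
        return method()
    return fallback_devices.get(gpe)


# Hypothetical system: the firmware handles GPE 0x0B itself but
# provides nothing at all for GPE 0x0C.
firmware = {0x0B: lambda: "USB1"}
fallback = {0x0B: "USB1", 0x0C: "USB2"}
print(handle_gpe(0x0C, firmware, fallback))
```

The `known_buggy` set stands in for the case where a method exists but never sends its notification. Here the method is simply skipped rather than unregistered and replaced.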

This code isn't upstream yet, but patches have been posted to the linux-acpi mailing list and with luck it'll be there in the 2.6.33 timeframe. My tests suggest about 0.2W saving per machine, which isn't going to save all that many polar bears but seems worth it anyway.

[1] _L1D = lid. Sigh.

[2] There's a few that are reserved for specific purposes

[3] So where before it had to be high to be active, it now has to be low to be active - this means that it'll now trigger on the switch being opened rather than closed, so you'll get another event when you open the lid again.

[4] You can find a list in the documentation for the appropriate ICH chip - the relevant section is "GPE0_STS" under the LPC interface chapter.

Syndicated 2009-11-10 03:08:36 from Matthew Garrett

9 Nov 2009 »

Looking to the past

It’s an oft-voiced suggestion that rather than looking at the bad things that happen in our communities, we should focus on the good things. There’s a number of highly successful geek women already – should we not be concentrating on encouraging more of them, rather than scaring people away with tales of thoughtlessness, discrimination and outright abuse?

Let’s draw an analogy. One day, a $20 charge appears on your credit card. You didn’t make it. You report it to your credit card company, who assure you that they take fraud seriously and then do nothing. A few days later, another $20 charge. Your credit card company tells you that such events are rare and unrepresentative of the general credit card experience, and continues to do nothing. A week afterwards, another charge. This time your credit card company describes how they’re planning on implementing a brand new anti-fraud system, but says that this is unrelated to any events that may currently be occurring and will give no details as to when it’s going to be rolled out. They then proceed to ignore any further reports you make about fraudulent transactions.

Would you stay with this company? Or would you take your business somewhere else?

The problem with the “Let’s look to the future rather than spending too much time getting stuck in the present” argument is that it assures people that things will get better without providing a roadmap for getting there. It does nothing to validate their concerns or make them feel wanted within a community. It assumes either that people will stick with a community that doesn’t respond to their complaints, or that it’s possible to construct a community that’s welcoming to an assortment of genders, ethnicities and lifestyles without any of those people being represented in the first place.

Ignoring people’s concerns is an excellent way to drive them away from your community. Doing so because of a potential future that’s probably conditional on you having those people in your community is short sighted and self defeating. Ignoring the present doesn’t benefit the future. It benefits the status quo.

(Originally posted here)

Syndicated 2009-11-09 20:56:21 from Matthew Garrett

28 Oct 2009 »

More GMA500

But is Intel really the party at fault, here?

For shipping a gpu without open drivers? Given that the alternatives involve someone else designing, fabbing and releasing a piece of hardware under Intel's name without being sued in the process, I'm going to have to say "Yes".

(Note that while Moblinzone.com is a website owned by Intel, the writers don't appear to be Intel employees)

Syndicated 2009-10-28 18:05:16 from Matthew Garrett

13 Oct 2009 »

Asymmetries in offence

I wasn't going to write about this since I thought that Chris's post covered pretty much everything I would have said, but after reading Scott's entry on how people would have interpreted Mark's remarks differently if he'd said "We'll have less trouble explaining to boys what we actually do" instead I realised that people are still confused about the fundamental issue here.

The assumption that Scott's making is that "girls" and "boys" are semantically equivalent in this case. They're not. There's various ways in which the symmetry is broken, but the most basic one is that Mark's a straight man. When the overwhelming stereotype is that "we" as a community are heterosexual males, using "we" as a shorthand for "People who are straight men" is unfortunate because it supports that stereotype. Using "we" as a shorthand for "People who are attracted to men" doesn't. Unsurprisingly, this results in a fairly significant change in who's going to be offended.

Whatever his intentions (and I could easily believe that it was a slip of the tongue), Mark managed to imply that the Linux community is entirely made up of straight men. This is possible because straight men do make up the majority of the Linux community. In contrast, Scott's version doesn't succeed in implying that everyone in the Linux community is attracted to men because it's blatantly obviously not the case, so we know that Scott is using "we" in a different manner. Context is important, and unless you can invert everything else about the situation as well then simply replacing the word "girls" with "boys" doesn't give you any meaningful insight into whether or not people are justifiably offended.

In a more general sense, I'm saddened by this case because I think it's a clear case where the Ubuntu code of conduct could have been used to good effect. "Be excellent to each other"[1] ought to include accepting that you've offended other people without meaning to and making appropriate restitution. If the offence was unintended, an apology should be cheap. Whatever the reality of the situation, failing to provide that apology gives people the impression that either the offence was intended or that Mark doesn't care about those who were offended. That's not a good way to build an inclusive community.

[1] Mako's original summary of the code of conduct

Syndicated 2009-10-12 18:30:07 from Matthew Garrett

24 Sep 2009 »

Intel IGD opregion and GMA500

A while back, Intel defined a specification for binding ACPI-defined methods for controlling hardware to the OS-specific driver, ensuring that the two don't get out of synchronisation. I added support for this to the in-kernel i915 driver last year, and after a couple of awkwardnesses it works well now. One consequence of this that showed up slightly later is that it's necessary to do some of the setup from the i915 driver rather than the ACPI driver, which meant that we had to defer the ACPI driver from binding until the drm driver had done that setup.

The problem with GMA500 is that it also implements the IGD Opregion spec, and the ACPI video driver detects this and refuses to bind. But the GMA500 kernel driver doesn't implement support for the spec and so doesn't call the function that triggers the ACPI video registration. Working around this is simple - just add acpi_video_register() to the init function of the GMA500 drm. But note that this means that you're failing to implement the spec properly, and there's potential for stuff to be broken. A full implementation of the spec for GMA500 wouldn't be especially difficult, but there's no docs and I have no hardware so I'm not going to do it myself.

The reason I bring this up is that various people have been approaching this problem in a different way. It's easy to assume that the check in the ACPI driver was naively assuming that all Intel hardware was driven by i915 and that this patch was broken. It's actually entirely correct and the (out of tree) GMA500 driver was broken. If Intel had made the effort to get their code properly upstream, it'd have been fixed there when the original change was made and nobody would ever have had a problem. Just say no to out of tree drivers.

Syndicated 2009-09-24 18:24:23 from Matthew Garrett

17 Sep 2009 »

Portland

I'm off to Boston in under 16 hours, and I'll be getting into Portland around lunchtime on Monday. I'll be talking at Linuxcon about how we're broadening power management on Linux to be applicable from phones through netbooks up to supercomputers - that's 10:15 on Tuesday. At 10AM on Friday I'll be presenting at the Linux Plumbers conference on how userspace can express its requirements to the kernel more clearly, thereby allowing the kernel to be smarter about powering down hardware. And after a short hop down to SF for the weekend, I'll be back in Portland at the X developers conference talking about the role of X in providing relevant information to the kernel and using that to facilitate more aggressive power management.

Three talks in under 10 days. I'll even do my best to ensure that there's new jokes for each of them.

Syndicated 2009-09-17 00:04:36 from Matthew Garrett

15 Sep 2009 »

Bye

I'm moving to the US on Thursday, so I will be here on Wednesday evening from about 7. If your presence is unlikely to make me stupefyingly angry, feel free to join me.

Syndicated 2009-09-15 11:16:36 from Matthew Garrett

7 Sep 2009 »

Good thing: Vicodin
Bad thing: The broken arm that necessitates the Vicodin

Syndicated 2009-09-07 14:13:20 from Matthew Garrett

25 Aug 2009 »

I moved out today, and in a bit over a month should be firmly relocated in Boston. Which means a bit over a month of living and working out of bags, but I'm in a park, it's sunny and I'm relaxed enough that even working through Bugzilla isn't making me sad.

Syndicated 2009-08-25 11:31:12 from Matthew Garrett

11 Aug 2009 »

Defective by Design

You know what's Defective by Design? Thinking that this kind of functionality is a good thing, resulting in this.

Syndicated 2009-08-11 20:18:04 from Matthew Garrett
