Older blog entries for mjg59 (starting at number 246)

PCI power management problems

I've previously written about some of the implementation details of runtime PCI power management. One of the key aspects for PCI devices is that the PME line on the device be able to generate a wakeup event. Unfortunately, it turns out that since Windows makes no use of this functionality at present, some vendors have started failing to wire this up. This is problematic because the device itself still announces PME support and will even raise a PME signal when it gets woken up - but it's pretty much screaming as hard as it can and nobody's listening because somebody's replaced the air with cheese.

Bother, etc.

The obvious question is what to do next. We could poll all the PCI devices every second or so to see if there's a PME status bit that's been set. This would be reasonably cheap (we're going to wake up at least once a second on x86 anyway, so it makes no real difference to power consumption) but would introduce obvious latency in the wakeup path. This would be fine for certain types of situation (you're probably not going to be too sad if your SD reader takes a second longer to notice a card insertion) but not others (if an SDIO wireless card generates a wakeup interrupt then we really ought to do something about it in a more sensible timeframe). A second option is to force users to pass a boot-time argument to enable runtime power management at all. That kind of sucks.
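As a rough illustration of what the polling option involves (a userspace sketch, not the kernel's implementation), the check comes down to walking a device's capability list, finding the Power Management capability and testing the PME_Status bit in its control/status register. The register offsets and bit values below are the standard PCI ones; the function names are made up for this example.

```python
import struct
from typing import Optional

PCI_STATUS = 0x06               # Status register offset
PCI_STATUS_CAP_LIST = 0x10      # Status bit 4: capability list present
PCI_CAPABILITY_LIST = 0x34      # Offset of the first capability pointer
PCI_CAP_ID_PM = 0x01            # Power Management capability ID
PCI_PM_CTRL = 4                 # PMCSR offset inside the PM capability
PCI_PM_CTRL_PME_STATUS = 0x8000 # PMCSR bit 15: PME has been asserted


def find_pm_capability(config: bytes) -> Optional[int]:
    """Return the offset of the PM capability, or None if there isn't one."""
    if len(config) < 0x40:
        return None
    (status,) = struct.unpack_from("<H", config, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return None
    pos = config[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()
    # Walk the linked list, guarding against loops and truncated reads.
    while pos >= 0x40 and pos + 1 < len(config) and pos not in seen:
        seen.add(pos)
        if config[pos] == PCI_CAP_ID_PM:
            return pos
        pos = config[pos + 1] & 0xFC
    return None


def pme_pending(config: bytes) -> bool:
    """True if the device reports PME_Status in its PMCSR."""
    pos = find_pm_capability(config)
    if pos is None or pos + PCI_PM_CTRL + 2 > len(config):
        return False
    (pmcsr,) = struct.unpack_from("<H", config, pos + PCI_PM_CTRL)
    return bool(pmcsr & PCI_PM_CTRL_PME_STATUS)
```

On Linux the config bytes could be read from /sys/bus/pci/devices/<address>/config, although reading past the first 64 bytes generally needs root. A poller would call pme_pending() for each device once a second, which is exactly where the latency described above comes from.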

The third possibility is limited to a subset of hardware, but does allow the introduction of some kind of elegance into what would otherwise be one of those nightmarish scenarios that makes me consider taking up farming instead. If we can trigger a PME ourselves then we can test whether the line is connected. This doesn't appear to be possible on SDHCI but the spec for FireWire makes it look like we can do it there - one of the interrupt sources is the port enable/disable register, and we can toggle that ourselves. I haven't actually tested this yet, but if it works that would let us make this determination.

It's depressing that doing anything interesting with power management is still heavily determined by what Microsoft have bothered to implement - to the extent that on some machines (hello, Thinkpads) there are no GPE methods at all for PME signals and you don't get runtime power management at all. The only thing that saves us is that (a) it's pretty hard to be able to screw up stuff that's already all glued into one chip package, so integrated stuff like USB should be fine anyway, and (b) PCIe does this as part of the normal traffic stream so there's no way to get that wrong. Unless you're still missing the GPE methods for the signals (hello, Thinkpads) in which case you get nothing unless the native PCIe signalling works. I suspect that we can always force that, so there's some hope yet.

In summary, then: dispassionate.

Syndicated 2010-05-03 21:25:26 from Matthew Garrett

Radeon reclocking update

The code I mentioned here is now all in the drm-radeon-testing branch of drm-2.6.git.

Syndicated 2010-04-28 13:07:05 from Matthew Garrett

Radeon reclocking

Alex released another set of Radeon power management patches over the weekend, and I've been adding my own code on top of that (Alex's patches go on top of drm-next, mine go on top of there). I've left it stress-testing for a couple of hours without it falling over, which tells me that it's stable enough that I can feel smug. This is a pleasing counterpoint to the previous experiences I've been having, which have been rife with a heady mixture of chip lockups and kernel deadlocks. It turns out that gpus are hard.

There's a few things you need to know about gpus. The first is that if they're discrete devices they typically have their own memory controller and video memory. The second is that there's an impressive number of ways that you can end up touching that memory. The third is that they tend to get upset if something tries to touch that memory and the memory controller is in the middle of reclocking at the time.

The first and most obvious use of video memory is by the gpu itself. Accelerated operations on radeon are carried out by sending a command packet to the command processor. This is achieved by sharing a ring buffer between the cpu and the gpu, with the gpu reading packets out of that ring buffer and performing the operations contained within them. Many of these operations will touch video memory (that being the nature of most things you want a gpu to do), and if that happens in the middle of a reclock, bad things occur. Like the card locking up and potentially taking your PCI bus with it.

So, obviously, we don't want that to happen. The first thing we do is take a mutex that blocks any further accelerated operations from being submitted by userspace. Then we wait until we get an interrupt from the gpu telling us that the graphics engine has gone idle. The problem here is that we don't have a terribly good idea of how many more operations there are to complete and we don't know how long each of those operations is going to take, but this is less bad than some of the alternatives[1]. Jerome Glisse has some ideas on how to improve this to require less waiting, but the effects should still be pretty much invisible to the average user.

So we've stopped the command processor touching ram. Everything's good, right?

Well, not really. The obvious problem is that users typically want to display something, so there's a separate chunk of chip that's repeatedly copying video memory over to your monitor. That's got to go too. Thankfully, there's a convenient bit in the crtc registers that lets us turn that off, but the pretty unsurprising downside is that your screen goes blank while that's happening. So we don't want to do that. Instead, we try to perform the reclock while there's nothing being displayed on the screen - that is, while we're in the screen region where a crt's electron gun would be scanning back from the bottom of the screen to the top. It turns out that rather a lot of display assumptions depend on this happening even if there's no crt, no electron gun and no thick sheet of glass with a decent approximation of vacuum behind it, so we get to do this even if we're displaying to an LVDS. And we have about 400-500 microseconds to do it - an almost generous amount of time.
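To put a number on that window, the length of vertical blanking follows directly from the mode timings: the number of blanked lines multiplied by the time taken to scan one line. The sketch below works that out for one assumed mode, a commonly quoted reduced-blanking 1280x800 timing (71 MHz pixel clock, 1440 pixels per line in total, 823 lines per frame in total). Those figures are an illustrative assumption, not something taken from the hardware discussed here.

```python
def vblank_duration_us(pixel_clock_hz, htotal, vdisplay, vtotal):
    """Length of the vertical blanking interval in microseconds."""
    line_time_s = htotal / pixel_clock_hz   # time to scan one full line
    blank_lines = vtotal - vdisplay         # lines with nothing displayed
    return blank_lines * line_time_s * 1e6


# Assumed example mode: 1280x800 with reduced blanking.
window = vblank_duration_us(71_000_000, htotal=1440, vdisplay=800, vtotal=823)
print(f"{window:.0f} us")
```

For that mode the window comes out at roughly 466 microseconds, which sits inside the 400-500 microsecond range. Modes with more generous blanking give a longer window, and getting the interrupt a line or two late eats about 20 microseconds per line at these timings.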

So we ask the hardware to generate an interrupt when we enter vblank and then we reclock. Except the hardware has an irritating habit of lying - sometimes we get the interrupt a line or two before vblank, sometimes we get it after we've already gone out the other side. Vexing, and not entirely solved yet - so sometimes you'll still get a single blank frame during reclock. But there are plans, and they'll probably even work.

At this point the acceleration hardware isn't touching the memory and the scanout hardware isn't touching the memory. Except it still crashes under some workloads. This one took me longer to track down, but the answer turned out to be pretty straightforward. Not all operations are accelerated. When they're not accelerated they have to be done in software. That means that the CPU has to write to the video memory itself. I'm sure you can see where this is going. This was fixed without too much trouble once I'd finished picking through the driver to work out every location where objects might be mapped into the CPU's address space, at which point it's a simple matter of unmapping them and blocking the fault handler from remapping them until the reclock is finished. Linux, thankfully, has lots of synchronisation primitives. And now everything works.

Except when it doesn't. This took a final period of head scratching, followed by the discovery that ttm (the memory allocator used by radeon) has a background thread that would occasionally fire to clean up objects. And by clean up objects, I mean change their mapping - which means updating their status in the gart, which means touching video memory. So, let's block that. And that tipped me off to the fact that even if it couldn't submit new commands, the CPU could still create or destroy objects - with the same consequences.

So, once all of these are blocked, video memory is quiescent and we can do what we want. And we do, at least once I'd sorted out the bits where I was taking locks in the wrong order and deadlocking. Depending on the powerplay tables available on your card we'll choose different rates and so your power savings will vary heavily depending on the values that your vendor provided, but the card I'm testing on sees a handy 30W drop at idle. Right now we're only changing clocks and not dropping voltage so there's potentially a little more to come.
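The lock ordering problem is easier to see in a toy model. The sketch below is not the radeon code: the four lock names are hypothetical stand-ins for the things described above (command submission, CPU mappings, the ttm cleanup thread, object creation and destruction), and it leaves out the waits for the idle interrupt and for vblank. What it does show is the usual cure for this kind of deadlock, which is to take every lock in one fixed order and release them in reverse.

```python
import threading
from contextlib import ExitStack, contextmanager

# Hypothetical names; the only thing that matters is that every
# caller acquires them in this same order.
LOCK_ORDER = (
    "command_submission",
    "cpu_mappings",
    "ttm_cleanup",
    "object_lifetime",
)


class Quiescer:
    def __init__(self):
        self.locks = {name: threading.Lock() for name in LOCK_ORDER}
        self.log = []

    @contextmanager
    def quiesced(self):
        """Block every path that could touch video memory."""
        with ExitStack() as stack:
            for name in LOCK_ORDER:
                stack.enter_context(self.locks[name])
                self.log.append("block " + name)
            yield
        # Leaving the ExitStack releases the locks in reverse order.


def reclock(quiescer, set_clocks):
    """Run set_clocks() only while everything is blocked."""
    with quiescer.quiesced():
        quiescer.log.append("reclock")
        set_clocks()
```

Because the order is fixed in one place, two threads can never each hold a lock the other is waiting for, and an exception inside set_clocks() still releases everything on the way out.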

While getting this stable was pretty miserable, the documented entry points for clock changing made a lot of this easier than it would otherwise have been. It's also probably worth noting that Intel's clock configuration registers are entirely missing from any public documentation and the driver Intel submitted to make them work in their latest chips appeared to have been deliberately obfuscated, so thanks to AMD for making all of this possible.

[1] It's possible to insert various command packets that either indicate when they've passed or stall until a register value gets updated, but these either cause awkward problems with the cp locking or mean that the gui idle functionality never goes idle, so they're not ideal either.

Syndicated 2010-04-27 23:28:02 from Matthew Garrett

Looks like I picked the wrong week to give up crystal meth

One of the features of Windows 7 is that hitting windows+p will pop up a little dialog that allows you to configure your active display outputs. This is an improvement over previous versions of Windows, which would generally instead have a variety of random vendor-specific tools that would function differently, look ugly and make you cry. So, hurrah to Windows for moving into the 21st century.

Most laptops have a display switch key. This is sent in a variety of ways, generally either via the keyboard controller, via WMI or via ACPI. In Linux we take all of these events and turn them into KEY_SWITCHVIDEOMODE, making it easy to implement standardised behaviour.

This is, obviously, far too straightforward.

Microsoft, rather than introducing an input mechanism that allows all of these events to hook into the windows+p infrastructure, provide the following recommendation:

As documented in this Launchpad bug, vendors are starting to do this. It's been seen in HP and Dell machines, at least, and it's presumably going to become more widespread.

So, if your display switch button now just makes the letter "P" appear, say thanks to Microsoft. There's a range of ways we can fix this, although none of them are straightforward and those of you who currently use left-Windows-p as a keyboard shortcut are going to be sad. I would say that I feel your pain, but my current plan is to spend the immediate future getting drunk enough that I stop caring.

(The good news is that the same set of recommendations says that you can no longer put a Windows sticker on a monitor unless it has a valid and accurate EDID. The bad news is that that implies that you've previously been able to put a Windows sticker on a monitor without it having a valid and accurate EDID.)

Syndicated 2010-04-22 22:33:51 from Matthew Garrett

Nook update (again)

Barnes and Noble released the nook source code last week. This includes the code to busybox, uboot and their kernel. Unfortunately, the uboot and kernel code both appear to be missing swathes of code found statically linked in the binaries that they're distributing. License compliance is hard, let's flail wildly.

Syndicated 2010-02-26 18:31:26 from Matthew Garrett

You know it's a bad day when:

ld gives you "Can not allocate memory".

(turned out to be a corrupt object file)

Syndicated 2010-02-24 19:21:58 from Matthew Garrett


As I mentioned, I headed to Pittsburgh last week to give some talks at CMU and find out something about what they're doing there. Despite the dire weather that had closed the airport the day before, I had no trouble getting into town and was soon safely in a hotel room with a heater that seemed oddly enthusiastic about blasting cold air at me for ten seconds every fifteen minutes. Unfortunately, it seems that life wasn't as easy for everyone - ten minutes after I arrived, I got a phone call telling me that the city had asked CMU to cancel classes the next day.

This turned out to be much less of a problem than I'd expected - whether because of their enthusiasm to learn about ACPI or because they simply hadn't noticed the alert telling them about the cancellation, a decent body of students turned up the next morning. After a brief chat with Mark Stehlik, the assistant dean for undergraduate education, I headed off to the lecture hall. The fact that I can now just plug my laptop into a VGA cable and have my desktop automatically extend itself continues to amaze me, as does OpenOffice's seemingly unerring ability to get confused about which screen should have my content and which should be showing me the next slide. Nevertheless, facts were imparted and knowledge dropped on those assembled. I'm even reasonably sure that the contents were factually accurate, which is a shame because the most attractive part of teaching always struck me as being able to lie to students who will then happily regurgitate whatever you tell them in case it turns up on the exam. Perhaps this is why I'm safer out of academia.

Lunch offered an opportunity to visit the Red Hat sponsored lab, which was pleasingly located somewhere other than a basement. The guy on the right of the picture is Greg Kesden, the director of undergraduate laboratories in CS there - it was wonderful to get an opportunity to see the machines getting used, and students seemed genuinely appreciative of the facility.

After lunch I spent a while talking to Satya about the Internet Suspend and Resume project. This is an impressive combination of virtualisation and migration, using a Fedora-based live image to bring up an OS on arbitrary hardware before downloading a machine image and launching it. The majority of the data is pulled in on demand, meaning that initial performance can be slow but ensuring that data is only downloaded if it's needed. When the user is finished, the delta between the original image and the new one can be pushed back to the server while remaining cached on the local machine in case the image is used again.
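The fetch-on-demand and push-back-the-delta behaviour can be modelled in a few lines. This is only a toy sketch of the idea as described above, not how the Internet Suspend and Resume project implements it: the class name, the chunk granularity and the fetch callback are all assumptions made for the example.

```python
class DemandPagedImage:
    """Toy model of a machine image that is fetched chunk by chunk."""

    def __init__(self, fetch_chunk):
        # fetch_chunk is a hypothetical callable (index -> bytes) that
        # stands in for downloading one chunk from the server.
        self._fetch = fetch_chunk
        self._cache = {}      # chunks held locally
        self._dirty = set()   # chunks modified since the last push
        self.fetches = 0

    def read(self, index):
        # Only go to the server the first time a chunk is needed.
        if index not in self._cache:
            self._cache[index] = self._fetch(index)
            self.fetches += 1
        return self._cache[index]

    def write(self, index, data):
        self._cache[index] = data
        self._dirty.add(index)

    def delta(self):
        """Chunks that differ from the server's copy."""
        return {i: self._cache[i] for i in sorted(self._dirty)}

    def mark_pushed(self):
        # The cache is kept so the image is cheap to reuse locally.
        self._dirty.clear()
```

The first read of each chunk pays the download cost, which is where the slow initial performance comes from, while delta() stays small however large the image is because it only covers what was actually changed.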

It's an interesting approach, combining the flexibility of thin clients with the advantages of having actually useful computing power at the local end. There's a few functional awkwardnesses, such as some VMs being unhappy if images are migrated between machines with different CPU features, and it obviously benefits from having significant bandwidth. But the idea of being able to combine the convenience of a floating session with the knowledge that you can still keep copies of your data on you is an attractive one, and I'd love a future where I can move my session between my laptop and a desktop.

After that there was some time to talk to Bill Scherlis and Philip Lehman about the software engineering courses that CMU run. Part of the minor in software engineering includes a course requirement to make a meaningful contribution to an existing software project, from design through to submission and upstream acceptance. I had the opportunity to talk to a couple of the students about this and the differences they found between working with the Mozilla and Chrome communities, which I'll try to write up at some point.

Finally I gave a presentation on Fedora and some of the issues that we face in providing a useful OS when patents and recalcitrant hardware vendors do their best to thwart us. Despite the ice outside and the significantly-below-freezing temperatures, enough people turned up that sorties had to be sent out to find extra chairs. It was great to see how interested people were in learning about what we do, although it's probably the case that the free pizza did help encourage people.

After that it was an early trip back to the airport, where I found that my plane was delayed and the only "restaurant" still open was McDonalds. Even so, I left with the feeling that it had been an interesting and educational visit. Many thanks to David Eckhardt, who runs the OS course I presented to and who looked after me all day - thanks too to Joshua Wise who picked me up when David was running late due to the ground being covered with blocks of ice.

Syndicated 2010-02-19 21:35:36 from Matthew Garrett

Gobi 2000

Anssi Hannula posted a patch to add Gobi 2000 support to qcserial and provided me with support for gobi_loader. I've added the gobi_loader code here. You'll need Anssi's kernel patch from here, and probably also my followup patch with extra IDs from here. Note that the 2000 devices need an extra firmware file (UQCN.mbn) as well as the apps.mbn and amss.mbn files.

The qcserial driver is currently broken in 2.6.32 and later. The breakage arrived with the switch to using kfifo for usb serial, but we haven't been able to work out the actual cause. I'm looking at alternative approaches.

Syndicated 2010-02-17 21:56:54 from Matthew Garrett

Shaping young minds

I'm off to CMU at the weekend, in order to do a couple of talks on Monday (the 8th). I'll be giving an introduction to ACPI to the operating systems class in the morning, and an open presentation on Fedora, some of the challenges we face and how to get involved in Linux in the afternoon. This is as a result of our cooperation with CMU, which has led to things like the request on the right. How could we refuse?

Syndicated 2010-02-05 18:54:34 from Matthew Garrett
