17 Apr 2008 mjg59   » (Master)

This week I'm in Mountain View for the X Developers' Conference, where I gave a presentation on things graphics drivers have to care about in order to make suspend/resume work reliably. Various people suggested that I write up an explanation of the current situation, along with some best practices for manufacturers.

One of the major differences between the ACPI specification and the older APM specification is that saving and restoring hardware state is generally left up to the operating system, with the firmware playing a much more passive role. Linux is now able to handle this state restoration for the majority of hardware, but we still have issues with graphics. Modern graphics hardware can be hugely complicated, which to a large extent we've been able to ignore because the platform will initialise it on boot. Many X drivers know how to switch a card from text mode to graphics mode (and vice-versa), but that's not adequate.

What can make this situation even more awkward is that the ACPI specification makes no guarantees about what state the hardware comes up in. It may still be in PCI D3. It may be powered up, but the registers may all be zeroed. It's even valid for the hardware to come up in text mode. A driver has to be able to deal with all of these situations.
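As a rough illustration of what "deal with all of these situations" means, here is a minimal user-space model in Python. It is a sketch, not kernel code: `FakeGpu`, `PowerState`, `resume` and the register names are all hypothetical and exist only to show the three cases above being handled without assuming anything about the state the device woke up in.

```python
from dataclasses import dataclass, field
from enum import Enum


class PowerState(Enum):
    D0 = 0  # fully powered
    D3 = 3  # powered down


@dataclass
class FakeGpu:
    """Hypothetical stand-in for a graphics device; not a real kernel structure."""
    power: PowerState
    registers: dict = field(default_factory=dict)
    text_mode: bool = False


def resume(gpu, saved_registers):
    """Restore the saved graphics state regardless of how the device woke up.

    Returns the list of steps that were needed, so the path taken is visible.
    """
    steps = []
    # Case 1: the device may still be in PCI D3.
    if gpu.power is not PowerState.D0:
        gpu.power = PowerState.D0
        steps.append("power up to D0")
    # Case 2: the registers may be zeroed (or hold whatever the firmware wrote).
    # Trust nothing and reprogram from the copy saved at suspend time.
    if gpu.registers != saved_registers:
        gpu.registers = dict(saved_registers)
        steps.append("reprogram registers")
    # Case 3: the firmware may have left the card in text mode.
    if gpu.text_mode:
        gpu.text_mode = False
        steps.append("leave text mode")
    return steps
```

For example, calling `resume(FakeGpu(PowerState.D3), saved)` takes the power-up and reprogramming steps, while a device that is already powered and programmed but sitting in text mode only needs the last one. The point is that the driver checks each condition rather than assuming one of them.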

So far, we've mostly tried to deal with this situation by running chunks of the video BIOS with applications like vbetool. This is hacky and often works badly - in some cases, it won't work at all (Nvidia rewrite the entry point to their video BIOS after boot, in order to prevent situations where someone attempts to POST the card and it tries to jump to bits of the system BIOS that no longer exist). The only long-term option is for the kernel to handle the graphics suspend/resume. There are several levels at which this can be supported:

  • Able to restore text mode. This is basically adequate right now, since the kernel enforces a virtual terminal switch around suspend/resume.
  • Able to restore arbitrary video mode. In the kernel modesetting world, we want to be able to go straight into the mode that we're running X in, in order to reduce screen flicker. This would probably still be surrounded with a VT switch out of and into X.
  • Able to jump straight into X. Right now, on x86 the process freezer will prevent X from running until after all of the devices have been resumed. As we move to a model where we remove the freezer and expect drivers to be able to block userspace tasks, we hit a pretty obvious problem with X - right now it hits the graphics hardware directly, rather than going via the kernel. With nothing to block it, it'll hit the hardware while it's still in an undefined state and then become very unhappy. This can't really be fixed until X does all its hardware access through the kernel. For now, we can inhibit X by leaving the VT switching in place.

Since 2.6.25, i915 and later Intel hardware can manage basic mode reprogramming without requiring any external assistance. The radeon kernel modesetting code is also heading in this direction, and we ought to be able to manage something there before too long. The nouveau developers are heading towards the stage of being able to do native mode programming, so there's some hope of progress there as well.

It's not all plain sailing, though. Vendors can wire up hardware in different ways. AMD and Nvidia hardware provides scripts in the ROM that, in theory, abstract this away. The downside to this is that we'll have to put the interpreters for these scripts in kernel space, though this is probably needed for kernel modesetting work anyway. Even then, it's possible that we'll need workarounds for specific pieces of hardware that don't quite behave as we expect them to.
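To make the idea of ROM scripts concrete, here is a toy interpreter in Python. The opcodes and script layout are invented for illustration and do not correspond to any real AMD or Nvidia table format; the sketch only shows how one piece of driver code can drive differently wired boards when each board ships its own script.

```python
# Hypothetical opcodes; real ROM script formats are different and richer.
OP_WRITE = 0x01     # (OP_WRITE, reg, value): store value in reg
OP_SET_BITS = 0x02  # (OP_SET_BITS, reg, mask): OR mask into reg
OP_END = 0xFF       # (OP_END,): stop executing


def run_script(script, registers):
    """Execute a board-supplied script against a register map (a dict)."""
    for op in script:
        code = op[0]
        if code == OP_END:
            break
        if code == OP_WRITE:
            _, reg, value = op
            registers[reg] = value
        elif code == OP_SET_BITS:
            _, reg, mask = op
            registers[reg] = registers.get(reg, 0) | mask
        else:
            # An unknown opcode is exactly the kind of case that would need
            # a hardware-specific workaround in a real driver.
            raise ValueError(f"unknown opcode {code:#x}")
    return registers


# Two imaginary boards enable the same output in different ways.
board_a = [(OP_WRITE, 0x10, 0x1), (OP_END,)]
board_b = [(OP_SET_BITS, 0x20, 0x4), (OP_END,)]
```

The driver calls `run_script` with whichever script the card's ROM provides and never needs to know which register the vendor wired the output to.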

As far as making graphics work over suspend/resume, we have a pretty good idea where we are now (failing on anything other than modern Intel hardware), where we need to be in the future (triumph, huge success, cake) and how to get there (?). However, hardware manufacturers can still help.

Chipset vendors:

Tell us how to reinitialise your hardware. If that involves parsing BIOS tables, then please let us know how to do that. Expecting us to jump into your ROM and execute x86 code is ok if that's genuinely the only way to do it, but for any modern hardware that's clearly not the case.

System vendors:

Try not to alter your system behaviour too wildly. Requiring magic register writes in order to light up the VGA output does not qualify as value add. Please don't have your firmware program text mode directly - we're trying to get away from that. And do not attempt to detect that you're running on Linux and alter your system behaviour. If it's possible for you to tell that you're not running on Windows, that's a bug in Linux. If you rely on that bug, you will break. I'm not joking here in the slightest. The closest thing to a "standard" interface between an OS and the hardware is whatever Windows does, and we're basically aiming for bug-for-bug compatibility there.

Syndicated 2008-04-17 20:06:04 from Matthew Garrett
