Older blog entries for mjg59 (starting at number 308)

UEFI secure booting

Since there are probably going to be some questions about this in the near future:

The UEFI secure boot protocol is part of recent UEFI specification releases. It permits one or more signing keys to be installed into a system firmware. Once enabled, secure boot prevents executables or drivers from being loaded unless they're signed by one of these keys. Another set of keys (the key exchange keys, or KEKs) permits communication between an OS and the firmware. An OS with the private half of a KEK that's installed in the firmware may add additional keys to the whitelist. Alternatively, it may add keys to a blacklist. Binaries signed with a blacklisted key will not load.
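
(As a concrete illustration: on a Linux system exposing the kernel's efivars interface, the secure boot state can be read straight out of NVRAM. This is a minimal sketch, assuming the /sys/firmware/efi/vars layout; the GUID is the UEFI global variable namespace.)

    /* Sketch: read the SecureBoot variable via the kernel's efivars
       sysfs interface. The path layout is an assumption; the variable
       is a single byte, 1 if secure boot is enabled. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/firmware/efi/vars/"
                        "SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c/data",
                        "rb");
        int c;

        if (!f) {
            perror("SecureBoot");
            return 1;
        }
        c = fgetc(f);
        fclose(f);
        printf("Secure boot %s\n", c == 1 ? "enabled" : "disabled");
        return 0;
    }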

There is no centralised signing authority for these UEFI keys. If a vendor key is installed on a machine, the only way to get code signed with that key is to get the vendor to perform the signing. A machine may have several keys installed, but if you are unable to get any of them to sign your binary then it won't be installable.

This impacts both software and hardware vendors. An OS vendor cannot boot their software on a system unless it's signed with a key that's included in the system firmware. A hardware vendor cannot run their hardware inside the EFI environment unless their drivers are signed with a key that's included in the system firmware. If you install a new graphics card that either has unsigned drivers, or drivers that are signed with a key that's not in your system firmware, you'll get no graphics support in the firmware.

Microsoft requires that machines conforming to the Windows 8 logo program and running a client version of Windows 8 ship with secure boot enabled. The two alternatives here are for Windows to be signed with a Microsoft key and for the public part of that key to be included with all systems, or alternatively for each OEM to include their own key and sign the pre-installed versions of Windows. The second approach would make it impossible to run boxed copies of Windows on Windows logo hardware, and also impossible to install new versions of Windows unless your OEM provided a new signed copy. The former seems more likely.

A system that ships with only OEM and Microsoft keys will not boot a generic copy of Linux.

Now, obviously, we could provide signed versions of Linux. This poses several problems. Firstly, we'd need a non-GPL bootloader. Grub 2 is released under the GPLv3, which explicitly requires that we provide the signing keys. The original Grub is under GPLv2, which lacks the explicit requirement for keys, but it could be argued that the requirement for the scripts used to control compilation includes that. It's a grey area, and exploiting it would be a pretty good show of bad faith. Secondly, in the near future the design of the kernel will mean that the kernel itself is part of the bootloader. This means that kernels will also have to be signed. Making it impossible for users or developers to build their own kernels is not practical. Finally, if we self-sign, it's still necessary to get our keys included by every OEM.

There's no indication that Microsoft will prevent vendors from providing firmware support for disabling this feature and running unsigned code. However, experience indicates that many firmware vendors and OEMs are interested in providing only the minimum of firmware functionality required for their market. It's almost certainly the case that some systems will ship with the option of disabling this. Equally, it's almost certainly the case that some systems won't.

It's probably not worth panicking yet. But it is worth being concerned.


Syndicated 2011-09-20 18:23:20 from Matthew Garrett

The Android/GPL situation

There was another upsurge in discussion of Android GPL issues last month, triggered by a couple of posts by Edward Naughton, followed by another by Florian Mueller. The central thrust is that section 4 of GPLv2 terminates your license on violation, and you need the copyright holders to grant you a new one. If they don't then you don't get to distribute any more copies of the code, even if you've now come into compliance. TL;DR: most Android vendors are no longer permitted to distribute Linux.

I'll get to that shortly. There are a few other issues that could do with some clarification. The first is Naughton's insinuation that Google are violating the GPL due to Honeycomb being closed or their "license washing" of some headers. There's no evidence whatsoever that Google have failed to fulfil their GPL obligations in terms of providing source to anyone who received GPL-covered binaries from them. If anyone has some, please do get in touch. Some vendors do appear to be unwilling to hand over code for GPLed bits of Honeycomb. That's an issue with the vendors, not Google.

His second point is more interesting, but the summary is "Google took some GPLed header files and relicensed them under Apache 2.0, and they've taken some other people's GPLv2 code and put it under Apache 2.0 as well". As far as the headers go, there's probably not much to see here. The intent was to produce a set of headers for the C library by taking the kernel headers and removing the kernel-only components. The majority of what's left is just structure definitions and function prototypes, and is almost certainly not copyrightable. And remember that these are the headers that are distributed with the kernel and intended for consumption by userspace. If any of the remaining macros or inline functions are genuinely covered by the GPLv2, any userspace application including them would end up a derived work. This is clearly not the intention of the authors of the code. The risk to Google here is indistinguishable from zero.

How about the repurposing of other code? Naughton's most explicit description is:

For example, Android uses “bootcharting” logic, which uses “the 'bootchartd' script provided by www.bootchart.org, but a C re-implementation that is directly compiled into our init program.” The license that appears at www.bootchart.org is the GPLv2, not the Apache 2.0 license that Google claims for its implementation.

But there's no indication that Google's reimplementation is a derived work of the GPLv2 original.

In summary: No sign that Google's violating the GPL.

Florian's post appears to be pretty much factually correct, other than this bit discussing the SFLC/Best Buy case:

I personally believe that intellectual property rights should usually be enforced against infringing publishers/manufacturers rather than mere resellers, but that's a separate issue.

The case in question was filed against Best Buy because Best Buy were manufacturing infringing devices. It was a set of own-brand Blu-ray players that incorporated Busybox. Best Buy were not a mere reseller.

Anyway. Back to the original point. Nobody appears to disagree that section 4 of the GPLv2 means that violating the license results in total termination of the license. The disagreement is over what happens next. Armijn Hemel, who has done various work on helping companies get back into compliance, believes that simply downloading a new copy of the code will result in a new license being granted, and that he's received legal advice that supports that. Bradley Kuhn disagrees. And the FSF seem to be on his side.

The relevant language in v2 is:

You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License.

The relevant language in v3 is:

You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License

which is awfully similar. However, v3 follows that up with:

However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.

In other words, with v3 you get your license back providing you're in compliance. This doesn't mesh too well with the assumption that you can get a new license by downloading a new copy of the software. It seems pretty clear that the intent of GPLv2 was that the license termination was final and required explicit reinstatement.

So whose interpretation is correct? At this point we really don't know - the only people who've tried to use this aspect of the GPL are the SFLC, and as part of their settlements they've always reinstated permission to distribute Busybox. There's no clear legal precedent. Which makes things a little awkward.

It's not possible to absolutely say that many Android distributors no longer have the right to distribute Linux. But nor is it possible to absolutely say that they haven't lost that right. Any sufficiently motivated kernel copyright holder probably could engage in a pretty effective shakedown racket against Android vendors. Whether they will do remains to be seen, but honestly if I were an Android vendor I'd be worried. There's plenty of people out there who hold copyright over significant parts of the kernel. Would you really bet on all of them being individuals of extreme virtue?


Syndicated 2011-09-01 21:39:36 from Matthew Garrett

Further adventures in EFI booting

Many people still install Linux from CDs. But a growing number install from USB. In an ideal world you'd be able to download one image that would let you do either, but it turns out that that's quite difficult. Shockingly enough, it's another situation where the system firmware exists to make your life difficult.

Booting a hard drive is pretty easy. The BIOS reads the first 512 bytes off the drive, copies them to RAM and executes them. That code is then responsible for either starting your bootloader or identifying the currently active partition and jumping to its boot sector, but before too long you're in a happy place where you're executing whatever you want to. Life is good. So you'd think that CDs would work in a similar way. The ISO 9660 format even leaves a whole 32KB at the start of a filesystem, which is enough space for a pretty awesome bootloader. But no. This is not how CDs work. That would be far too easy.
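
(To make that concrete: the only thing marking those 512 bytes as bootable is a two-byte signature at the end of the sector. A minimal sketch of the check a BIOS performs, as a standalone illustration rather than anything lifted from real firmware:)

    /* Sketch: check whether a disk image carries the 0xAA55 signature
       the BIOS requires before it will execute a boot sector. */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        unsigned char mbr[512];
        FILE *f;

        if (argc != 2 || !(f = fopen(argv[1], "rb")))
            return 1;
        if (fread(mbr, 1, sizeof(mbr), f) != sizeof(mbr))
            return 1;
        fclose(f);
        /* bytes 510 and 511 must be 0x55 0xaa */
        printf("%s\n", (mbr[510] == 0x55 && mbr[511] == 0xaa) ?
               "bootable" : "not bootable");
        return 0;
    }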

Let's imagine we're back in the 90s. People want to be able to boot off CD without needing a boot floppy to do so. And you're a PC vendor with a BIOS that's been lovingly[1] forced into a tiny piece of flash and which has to execute out of an almost as tiny piece of RAM if you want your users to be able to play any games. Letting boot code read arbitrary content off the CD would mean adding a new set of interrupt hooks, and that's going to be even more complicated because CDs have a sector size of 2K while hard drives are 512 bytes[2] and who's going to pay to implement this and for the extra flash and RAM and look surely there has to be another way?

So, of course, another way was found. The El Torito specification defines a way for shoving a reference to some linear blocks into the ISO 9660 header. The BIOS reads those blocks into memory and then redirects either the floppy or hard drive access interrupts (depending on the El Torito type) to that region. The boot code can then proceed as if it had been read off a floppy without all the trouble of actually putting a floppy in the machine, and the extra code required in the system BIOS is minimal.
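
(For the curious, the "reference" is a boot record volume descriptor at sector 17 of the ISO, which carries a pointer to a boot catalog describing the available images. A rough sketch of digging it out, with offsets taken from the El Torito spec:)

    /* Sketch: locate the El Torito boot catalog in an ISO image. The
       boot record volume descriptor lives at sector 17 (2K sectors). */
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        unsigned char d[2048];
        unsigned int catalog;
        FILE *f;

        if (argc != 2 || !(f = fopen(argv[1], "rb")))
            return 1;
        fseek(f, 17 * 2048L, SEEK_SET);
        if (fread(d, 1, sizeof(d), f) != sizeof(d))
            return 1;
        fclose(f);
        /* type 0 descriptor, "CD001", "EL TORITO SPECIFICATION" at byte 7 */
        if (d[0] != 0 || memcmp(d + 1, "CD001", 5) ||
            memcmp(d + 7, "EL TORITO SPECIFICATION", 23))
            return 1;
        /* little-endian LBA of the boot catalog at offset 0x47 */
        catalog = d[0x47] | d[0x48] << 8 | d[0x49] << 16 |
                  (unsigned)d[0x4a] << 24;
        printf("boot catalog at sector %u\n", catalog);
        return 0;
    }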

USB sticks, however, are treated as hard drives. The BIOS won't look for El Torito images on them. Instead, it'll try to execute a boot sector. That isn't there on a CD image. Sigh.

A few years ago a piece of software called isohybrid popped up and solved this problem nicely. isohybrid is a companion to isolinux, which itself is a bootloader that fits into an El Torito image and can then load your kernel and installer from CD. isohybrid takes an ISO image, adds an x86 boot sector and partition table and does some more fiddling to turn a valid ISO image into one that can be copied directly onto a USB stick and booted. The world suddenly becomes a better place.

But that's BIOS. EFI makes this easier, right? Right?

No. EFI does not make this easier.

Despite EFI being a modern firmware for the modern world[3], EFI implementations are not required to be able to understand ISO 9660. In fact, I've never seen one that does. FAT is all the spec requires, and FAT is typically all you get. Nor will EFI just execute some arbitrary boot code from the start of the CD. So, how does EFI boot off CD?

El Torito. Obviously.

It's not quite as bad as it sounds, merely almost as bad as it sounds. While the typical way of using El Torito for a long time was to use floppy or hard drive emulation, it also supports a "No emulation" mode. It also supports setting a type flag for your media, which means you can distinguish between images intended for BIOS booting and EFI booting. But the fact remains that your CD has to include an embedded FAT partition that then contains a bootloader that's able to read ISO 9660 because your firmware is too inept to handle that itself[4].

How about USB sticks? Thankfully, booting these on EFI doesn't require any boot sectors at all. Instead you just have to have a partition table, a FAT partition and a bootloader in a well known location in that FAT partition. The required partition is, in fact, identical to the one you need in an El Torito image. And so this is where we start introducing some extra hacks.

Like I said earlier, isohybrid fakes up an MBR and adds some boot code that points at the actual bootloader. It needs to do a little more on EFI. The first problem is that the isohybrid MBR partition has to cover the entire ISO 9660 filesystem on the USB stick so that the operating system can access it later, but the El Torito FAT image is inside that partition. A lot of MBR-based code becomes very unhappy if you try to set up a partition that's a subset of another partition. So we can't really use MBR. On to GPT.

GPT, or the GUID Partition Table, is the EFI era's replacement for MBR partitions. It has two main advantages over MBR - firstly it can cover partitions larger than 2TB without having to increase sector size, and secondly it doesn't have the primary/logical partition horror that still makes MBR more difficult than it has any right to be. The format is pretty simple - you have a header block 1 logical block into the media (so 512 bytes on a typical USB stick), and then a pointer to a list of partitions. There's then a secondary table one block from the end of the disk, which points at another list of partitions. Both blocks have multiple CRCs that guarantee that neither the header nor the partition list have been corrupted. It turns out to be a relatively straightforward modification of isohybrid to get it to look for a secondary EFI image and construct a GPT entry pointing at it. This works surprisingly well, and media prepared this way will boot EFI machines if burned to a CD or written to a USB stick.
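
(For reference, the header looks roughly like this; field names are abbreviated from the UEFI spec. Note that the location of the partition entry list is itself a field, which becomes important shortly.)

    /* The GPT header, as laid out at LBA 1 (byte 512 on 512-byte media).
       All fields little-endian; layout per the UEFI spec. */
    #include <stdint.h>

    struct gpt_header {
        char     signature[8];           /* "EFI PART" */
        uint32_t revision;
        uint32_t header_size;
        uint32_t header_crc32;           /* CRC of this header */
        uint32_t reserved;
        uint64_t my_lba;                 /* where this header lives */
        uint64_t alternate_lba;          /* the backup header near the end */
        uint64_t first_usable_lba;
        uint64_t last_usable_lba;
        uint8_t  disk_guid[16];
        uint64_t partition_entry_lba;    /* start of the partition list */
        uint32_t num_partition_entries;
        uint32_t partition_entry_size;   /* usually 128 */
        uint32_t partition_array_crc32;  /* CRC of the partition list */
    } __attribute__((packed));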

There are a few quirks. Macs will show two boot icons for these CDs[6], one marked "EFI Boot" and one helpfully marked "Windows"[7], with the latter booting the BIOS El Torito image. That's a little irritating, but not insurmountable. The other issue is that older Macs won't look for boot loaders in the legacy locations. This is where things start getting horrible.

Back in the old days, Apple boot media used to have a special "blessed" folder. Attempting to boot would involve the firmware looking for such a folder and then using that to start itself up. Any folder in the filesystem could be blessed. Modern hardware doesn't use boot folders, but does use boot files. For an HFS+ filesystem, the inode of the bootloader is written to a specific offset in the filesystem superblock and the firmware simply finds that inode and executes it. And this appears to be all that older Macs support.
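
(A sketch of what that amounts to on disk: the HFS+ volume header sits 1024 bytes into the partition, with a finderInfo array at offset 80 within it. Per my reading of Apple's TN1150 and the open source bless tool, word 1 holds the blessed file's catalog node ID on Intel Macs; treat the word meanings here as assumptions.)

    /* Sketch: dump the finderInfo words from an HFS+ volume header.
       Offsets per Apple TN1150; the interpretation of the words follows
       the open source bless tool and may not be exhaustive. */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        unsigned char fi[32];
        FILE *f;
        int i;

        if (argc != 2 || !(f = fopen(argv[1], "rb")))
            return 1;
        /* volume header at byte 1024, finderInfo at offset 80 within it */
        fseek(f, 1024 + 80, SEEK_SET);
        if (fread(fi, 1, sizeof(fi), f) != sizeof(fi))
            return 1;
        fclose(f);
        for (i = 0; i < 8; i++) {
            /* big-endian u32s; word 0 is the blessed folder, word 1
               appears to be the blessed (bootable) file's CNID */
            unsigned int v = (unsigned)fi[i * 4] << 24 |
                             fi[i * 4 + 1] << 16 |
                             fi[i * 4 + 2] << 8 |
                             fi[i * 4 + 3];
            printf("finderInfo[%d] = %u\n", i, v);
        }
        return 0;
    }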

So, having written a small tool to bless an HFS+ partition, I tried the obvious first step of burning a CD with three El Torito images (one BIOS, one FAT, one HFS+). It failed. While rEFIt could see the bootloader in the HFS+ image, the firmware appeared to have no interest at all in booting off it. Yet Apple install media would boot. What was the difference?

The difference, obviously, was that these earlier Macs don't appear to support El Torito booting. The Apple install media contained an Apple partition map.

The Apple partition map (APM) is Apple's legacy partition table format. Apple mostly dropped it when they went to x86, but it's retained for two purposes. The first is for drives that need to be shared between Intel Macs and PPC ones. The second seems to be for their install DVDs. Some further playing revealed that burning a CD with an APM entry pointing at the HFS+ filesystem on the CD gave me a boot icon. Problem solved?

Not really. Remember how I earlier mentioned that ISO 9660 leaves 32KB at the start of the image, and that an isohybrid image then writes an MBR and boot sector in the first 512 bytes of that, and the GPT header starts 512 bytes into a drive? That means that it's easy to produce an ISO that has both a boot sector, MBR partition table and GPT. None of them overlap. APM, on the other hand, has a header that's located at byte 0 of the media, overlapping with the boot sector. And it has a partition listing that's located at sector 1, overlapping with the GPT. Is all lost?

No. Merely sanity.

The first thing to remember is that the boot sector is just raw assembler. It's a byte stream that's executed by the CPU. And there's a lot of things you can tell a CPU to do that result in nothing happening. Peter Jones pointed out that the only bits of the APM header you actually need are the letters "ER", followed by the sector size as a two byte big endian integer. These disassemble to harmless instructions, so we can simply move the boot code down a little and stick these at the beginning. A PC that executes it will read straight through the bizarre (but harmless) Apple bytes and then execute the real boot code.
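
(Concretely, the start of the sector ends up looking something like this; the disassembly is easy to verify with any 16-bit disassembler. The 2K sector size value is explained below.)

    /* The first four bytes of the hybrid boot sector: the APM magic,
       which a real-mode CPU happily executes as harmless instructions. */
    unsigned char apm_stub[4] = {
        0x45,   /* 'E'  -> inc bp  */
        0x52,   /* 'R'  -> push dx */
        0x08,   /* sector size 2048, big endian: 0x08 0x00... */
        0x00,   /* ...which together decode as: or [bx+si], al */
    };
    /* ...followed by the relocated x86 boot code, with the 0x55 0xAA
       signature still at bytes 510/511. */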

The second thing that's important here is that we were just given the opportunity to specify the sector size. The GPT is only relevant when the image is written to a USB stick, so assumes a sector size of 512 bytes. So when the GPT starts one sector into the drive, it's actually starting 512 bytes into the drive. APM also starts one sector into the drive, but we can simply put a different sector size into the header and suddenly we're able to choose where that's going to be. 2K seems like a good choice, and so the firmware will now look for the header at byte 2048.

That's still in the middle of the GPT partition listing, though. Except we can avoid that as well. GPT lets you specify where the partition listing starts and doesn't require it to be immediately after the header. So we can offset the partition listing to, say, byte 8192 and leave a hole for the Apple partition map.
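
(Putting that together, the front of the finished image ends up looking roughly like this - offsets as discussed above, with 8192 just being the example hole:)

    byte 0      APM magic ("ER", 2048) doubling as x86 code, then boot code
    byte 510    0x55 0xAA boot signature
    byte 512    GPT header, with its partition list pointer moved to 8192
    byte 2048   APM partition map entries (sector 1 at a 2K sector size)
    byte 8192   GPT partition entries
    byte 32768  ISO 9660 volume descriptors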

And, shockingly, this works. Setting up a CD this way gives a boot icon on old Macs. On new Macs, it gives three - one for legacy boot, one for EFI boot via FAT and one for EFI boot via HFS. Less than ideal, but eh. The one remaining problem is that this doesn't work for USB sticks (the firmware sees the GPT and ignores the APM), so we also need to add a GPT entry for the HFS+ partition. Job done.

So, it is possible to produce install media that will work if burned to CD or written to a USB stick. It's even possible to produce a version that will work on Macs, as long as you're willing to put up with three partition tables and an x86 boot sector that doubles as an APM header. And patches to isohybrid to do all of this will be turning up as soon as I tidy the code to the point where it works without having to hack in offsets by hand.

[1] Insert some other adverb here if you feel like it
[2] Why yes, 15 years later BIOSes still tend to assume 512 bytes. Which is why your 4K sector disk is much harder to work with than you'd like it to be.
[3] Ever noticed how the modern world involves a great deal of suffering, misery and death? EFI fits into that world perfectly.
[4] Obviously if you want your media to be bootable via both BIOS and EFI you need to produce a CD with two El Torito images. BIOS systems should ignore the image that says it's for EFI, and EFI systems should ignore the BIOS one. Some especially creative BIOS authors[5] have decided that users shouldn't have their choices limited in such a way, and so pop up a screen that says:


Select CD-ROM boot type:

1.
2.

and wait for the user to press a key. The lack of labels after the numbers is not a typographical error on my part.
[5] Older (pre-2009, and some 2009 models) Apple hardware has this bug if a dual-El Torito CD is booted via the BIOS compatibility layer. This is especially unfortunate because said machines often fail to provide a working keyboard emulation at this stage, resulting in you being stuck forever at an impressively unhelpful screen. This isn't a Linux bug, since it's happening before we've run any of our code at all. It's not even limited to Linux. 64-bit install media for Vista SP1, Windows 7 and Server 2008 all have similar El Torito layout and all trigger the same bug on Apple hardware. Apple's aware of this, and has resolved the issue by declaring that these machines don't support 64 bit Windows.
[6] Even further investigation reveals that Apple will show you as many icons as there are El Torito images, which is a rare example of Apple giving the user the freedom to brutally butcher their extremities if they so desire
[7] "Windows" is Apple code for "Booting via BIOS compatibility". The Apple boot menu will call any filesystem with a BIOS boot sector Windows.

Syndicated 2011-07-26 12:48:46 from Matthew Garrett

OSCON

Last night, Tim O'Reilly posted about O'Reilly's desire to ensure that conferences they run are free of harassment and as welcoming to as much of the community as possible. This comes on the back of a brief campaign by various people concerned that the absence of such a policy at one of the largest open source conferences was a problem. O'Reilly turned out to be highly responsive, and it's a credit to everyone involved that this got worked out in such a short space of time.

OSCON's an interesting conference. First, it's huge. There are upwards of 15 simultaneous tracks in the main conference. It covers a huge range of topics, from Linux to pretty much any piece of open source middleware you can think of, to web technologies to hardware hacking to free culture. It's a cross section of pretty much everything that can plausibly be described as open source. As such, the demographics are very different to a typical Linux conference, or even a typical single-field technical conference such as PyCon. We may have discussed these issues at length in the Linux community, but that's such a small part of the OSCON audience that it's not surprising that awareness is lower.

Which means this is a pretty big deal. Sexual harassment isn't something that's limited to Linux conferences. It's prevalent throughout the entire open source world. Having a conference that represents a broader part of that world than any other accept that this is a genuine problem is a massive step towards raising awareness of it in the wider community.

Syndicated 2011-07-25 22:25:47 from Matthew Garrett

Contributor License Agreement corner cases

If you don't have any interest in tedious licensing discussion then I recommend not reading this. It's not going to be a great deal of fun.

Having said that:

I'm looking at the Canonical Individual Contributor License Agreement (pdf here). In contrast to the previous copyright assignment, it merely grants a broad set of rights to Canonical, including the right to relicense the work under any license they choose. Notably, it does not transfer copyright to Canonical. The contributor retains copyright.

So here's a thought experiment. Canonical release a project under the GPLv3. I produce a significant modification to it. It's now clearly a derived work of Canonical's project, so I have to distribute it under the GPLv3. I have no right to distribute it under a proprietary license. I sign the CLA and provide my modification to Canonical. The grant I give them includes giving them permission to relicense my work under any license they choose. As copyright holders to the original work, they may also change the license of that work. But, notably, I have not granted them copyright to my work. I continue to hold that.

Now someone else decides to extend the functionality that I added. By doing so they are creating a work that's both a derivative work of Canonical's code and also a derivative work of my code. How can they sign the CLA? I granted Canonical the right to grant extra permissions on my work. I didn't grant anyone else that right.

This wasn't a problem with the copyright assignment case, because as copyright holders Canonical could simply grant that permission to all downstream recipients. But Canonical aren't the copyright holder, and unless they explicitly relicense my work I don't see any way that they can accept derivatives of it. The only way I can see this working is if all Canonical code is actually distributed under an implicit license that's slightly more permissive than the GPLv3. But there's nothing saying that it is at present.

Now, I'm obviously not a lawyer. I may be entirely wrong about the above. But asymmetric CLAs introduce an additional level of complexity into the entire process of contributing that make it even more difficult for a potential contributor to become involved. I've spent far more time than most worrying about licenses and even I don't understand exactly what I'm giving up, which is ironic given that a stated aim is usually that they increase certainty about licensing. Is the opportunity to relicense really worth alienating people who would otherwise be doing free work for you?

Syndicated 2011-07-25 17:25:03 from Matthew Garrett

Booting with EFI

One of the ways in which EFI is *actually* better than BIOS is its native support for multiple boot choices. All EFI systems should have an EFI system partition which holds the OS bootloaders. Collisions are avoided by operating system vendors registering a unique name here, so there's no risk that Microsoft will overwrite the Fedora bootloader or whatever. After installing its bootloader, the OS installer simply sets an NVRAM variable pointing at it along with a descriptive name, and (if it wants) sets the default boot variable to point at that. The firmware will then typically provide some mechanism to override that default via a menu of all the configured variables.

This obviously doesn't work so well for removable media, where otherwise you'd have an awkward chicken and egg problem or have to force people to drop to a shell and run the bootloader themselves. This is handled by looking for EFI/boot/boot(architecture).efi, where architecture depends on the system type - examples include bootia32.efi, bootia64.efi and bootx64.efi. Since vendors have complete control over their media, there's still no risk of collisions.

Why do we care about collisions? The main reason this is helpful is that it means that there's no single part of the disk that every OS wants to control. If you install Windows it'll write stuff in the MBR and set the Windows partition as active. If you install Linux you'll either have to install grub in the MBR or set the Linux partition as active. Multiple Linux installations, more problems. It's very, very annoying to handle the multiple OS case with traditional BIOS.

This was all fine until UEFI 2.3 added section 3.4.1.2 of the spec, which specifies that in the absence of any configured boot variables it is permitted for the firmware to treat the EFI system partition in the same way as removable media - that is, it'll boot EFI/boot/bootx64.efi or whatever. And, if you install Windows via EFI, it'll install an EFI/boot/bootx64.efi fallback bootloader as well as putting one in EFI/Microsoft.

Or, in other words, if your system fails to implement the boot variable section of the specification, Windows will still boot correctly.

As we've seen many times in the past, the only thing many hardware vendors do is check that Windows boots correctly. Which means that it's utterly unsurprising to discover that there are some systems that appear to ignore EFI boot variables and just look for the fallback bootloader instead. The fallback bootloader that has no namespacing, guaranteeing collisions if multiple operating systems are installed on the same system.

It could be worse. If there's already a bootloader there, Windows won't overwrite it. So things are marginally better than in the MBR days. But the Windows bootloader won't boot Linux, so if Windows gets there first we still have problems. The only solution I've come up with so far is to have a stub bootloader that is intelligent enough to scan the EFI system partition for any other bootloaders and present them as a menu, and for every Linux install to just blindly overwrite bootx64.efi if it already exists. Spec-compliant firmware should always ignore this and run whatever's in the boot variables instead.

This is all clearly less than optimal. Welcome to EFI.

Syndicated 2011-07-14 17:14:51 from Matthew Garrett

IPv6 routers

I have a WRT54G. I've had it for some years. It's run a bunch of different firmware variants over that time, but they've all had something in common. There's no way to configure IPv6 without editing text files, installing packages and punching yourself in the face repeatedly. Adam blogged about doing so today, and I suspect he may be in need of some reconstructive surgery now.

I spent yesterday looking at disassembled ACPI tables and working out the sequence of commands the firmware was sending to the hard drive. I'm planning on spending tomorrow writing x86 assembler to parse EFI memory maps. I spend a lot of time caring about stupidly awkward implementation details worked out from staring at binary dumps. The last thing I want to do is have to spend more than three minutes working out how to get IPv6 working on my home network because that cuts into the time I can spend drinking to forget.

Thankfully this is the future and punching yourself in the face is now an optional extra rather than bundled. Recent versions of Tomato USB (ie, newer than actually released) have a nice web UI for this. I registered with Tunnelbroker.net, got a tunnel, copied the prefix and endpoint addresses into the UI, hit save and ever since then NetworkManager has given me a routable IPv6 address. It's like the future.

Because I'm lazy I ended up getting an unofficial build from here. The std build doesn't seem to include IPv6, so I grabbed the miniipv6 one. The cheat-sheet for identifying builds is here. And I didn't edit a single text file. Excellent.

Syndicated 2011-06-09 01:42:27 from Matthew Garrett

A use for EFI

Anyone who's been following anything I've written lately may be under the impression that I dislike EFI. They'd be entirely correct. It's an awful thing and I've lost far too much of my life to it. It complicates the process of booting for no real benefit to the OS. The only real advantage we've seen so far is that we can configure boot devices in a vaguely vendor-neutral manner without having to care about BIOS drive numbers. Woo.

But there is something else EFI gives us. We finally have more than 256 bytes of nvram available to us as standard. Enough nvram, in fact, for us to reasonably store crash output. Progress!

This isn't a novel concept. The UEFI spec provides for a specially segregated area of nvram for hardware error reports. This is lovely but not overly helpful for us, because they're supposed to be in a well-defined format that doesn't leave much scope for "I found a null pointer where I would really have preferred there not be one" followed by a pile of text, especially if the firmware's supposed to do something with it. Also, the record format has lots of metadata that I really don't care about. Apple have also been using EFI for this, creating a special variable that stores the crash data and letting them get away with just telling the user to turn their computer off and then turn it back on again.

EFI's not the only way this could be done, either. ACPI specifies something called the ERST, or Error Record Serialization Table. The OS can stick errors in here and then they can be retrieved later. Excellent! Except ERST is currently usually only present on high-end servers. But when ERST support was added to Linux, a generic interface called pstore went in as well.

Pstore's very simple. It's a virtual filesystem that has platform-specific plugins. The platform driver (such as ERST) registers with pstore and the ERST errors then get exposed as files in pstore. Deleting the files removes the records. pstore also registers with kmsg_dump, so when an oops happens the kernel output gets dumped back into a series of records. I'd been playing with pstore but really wanted something a little more convenient than an 8-socket server to test it with, so ended up writing a pstore backend that uses EFI variables. And now whenever I crash the kernel, pstore gives me a backtrace without me having to take photographs of the screen. Progress.
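
(If you want to poke at it, the records just show up as files once the filesystem is mounted - mount -t pstore pstore /some/path - and deleting a file erases the backing record. A minimal sketch, with the mount point left as an argument since it isn't standardised:)

    /* Sketch: list pstore records. Assumes the pstore filesystem is
       mounted at the path given on the command line; unlink()ing one
       of the files deletes the underlying record. */
    #include <stdio.h>
    #include <dirent.h>

    int main(int argc, char **argv)
    {
        struct dirent *de;
        DIR *d;

        if (argc != 2 || !(d = opendir(argv[1])))
            return 1;
        while ((de = readdir(d))) {
            if (de->d_name[0] == '.')
                continue;
            /* e.g. a dmesg-efi-N file: a chunk of crash-time kernel
               output recovered from the platform backend */
            printf("%s/%s\n", argv[1], de->d_name);
        }
        closedir(d);
        return 0;
    }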

Patches are here. I should probably apologise to Seiji Aguchi, who was working on the same problem and posted a preliminary patch for some feedback last month. I replied to the thread without ever reading the patch and then promptly forgot about it, leading to me writing it all from scratch last week. Oops.

(There's an easter egg in the patchset. First person to find it doesn't win a prize. Sorry.)

Syndicated 2011-06-07 19:54:48 from Matthew Garrett

Rebooting

You'd think it'd be easy to reboot a PC, wouldn't you? But then you'd also think that it'd be straightforward to convince people that at least making some effort to be nice to each other would be a mutually beneficial proposal, and look how well that's worked for us.

Linux has a bunch of different ways to reset an x86. Some of them are 32-bit only and so I'm just going to ignore them because honestly just what are you doing with your life. Also, they're horrible. So, that leaves us with five of them.

  • kbd - reboot via the keyboard controller. The original IBM PC had the CPU reset line tied to the keyboard controller. Writing the appropriate magic value pulses the line and the machine resets. This is all very straightforward, except for the fact that modern machines don't have keyboard controllers (they're actually part of the embedded controller) and even more modern machines don't even pretend to have a keyboard controller. Now, embedded controllers run software. And, as we all know, software is dreadful. But, worse, the software on the embedded controller has been written by BIOS authors. So clearly any pretence that this ever works is some kind of elaborate fiction. Some machines are very picky about hardware being in the exact state that Windows would program. Some machines work 9 times out of 10 and then lock up due to some odd timing issue. And others simply don't work at all. Hurrah!
  • triple - attempt to generate a triple fault. This is done by loading an empty interrupt descriptor table and then calling int(3). The interrupt fails (there's no IDT), the fault handler fails (there's no IDT) and the CPU enters a condition which should, in theory, then trigger a reset. Except there doesn't seem to be a requirement that this happen and it just doesn't work on a bunch of machines.
  • pci - not actually pci. Traditional PCI config space access is achieved by writing a 32 bit value to io port 0xcf8 to identify the bus, device, function and config register. Port 0xcfc then contains the register in question. But if you write the appropriate pair of magic values to 0xcf9, the machine will reboot. Spectacular! And not standardised in any way (certainly not part of the PCI spec), so different chipsets may have different requirements. Booo. (There's a sketch of the magic values after this list.)
  • efi - EFI runtime services provide an entry point to reboot the machine. It usually even works! As long as EFI runtime services are working at all, which may be a stretch.
  • acpi - Recent versions of the ACPI spec let you provide an address (typically memory or system IO space) and a value to write there. The idea is that writing the value to the address resets the system. It turns out that doing so often fails. It's also impossible to represent the PCI reboot method via ACPI, because the PCI reboot method requires a pair of values and ACPI only gives you one.
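
(For illustration, here's a minimal sketch of the kbd and pci methods, roughly following the magic values the kernel uses. It needs root and raw I/O port access, and obviously a successful run ends your session.)

    /* Sketch of the kbd and pci reset methods. Illustration only:
       don't run this expecting to keep your uptime. */
    #include <sys/io.h>
    #include <unistd.h>

    static void kbd_reset(void)
    {
        /* 0xfe to the keyboard controller command port pulses the
           CPU reset line, on machines that still wire it up */
        outb(0xfe, 0x64);
    }

    static void pci_reset(void)
    {
        /* the 0xcf9 dance: request a hard reset, then trigger it */
        outb(0x02, 0xcf9);
        usleep(50);
        outb(0x06, 0xcf9);
    }

    int main(void)
    {
        if (iopl(3) < 0)
            return 1;
        pci_reset();   /* if we get here, that didn't work... */
        kbd_reset();
        return 0;
    }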

Now, I'll admit that this all sounds pretty depressing. But people clearly sell computers with the expectation that they'll reboot correctly, so what's going on here?

A while back I did some tests with Windows running on top of qemu. This is a great way to evaluate OS behaviour, because you've got complete control of what's handed to the OS and what the OS tries to do to the hardware. And what I discovered was a little surprising. In the absence of an ACPI reboot vector, Windows will hit the keyboard controller, wait a while, hit it again and then give up. If an ACPI reboot vector is present, Windows will poke it, try the keyboard controller, poke the ACPI vector again and try the keyboard controller one more time.

This turns out to be important. The first thing it means is that it generates two writes to the ACPI reboot vector. The second is that it leaves a gap between them while it's fiddling with the keyboard controller. And, shockingly, it turns out that on most systems the ACPI reboot vector points at 0xcf9 in system IO space. Even though most implementations nominally require two different values be written, it seems that this isn't a strict requirement and the ACPI method works.

Linux 3.0 will ship with this behaviour by default. It makes various machines work (some Apples, for instance), improves things on some others (some Thinkpads seem to sit around for extended periods of time otherwise) and hopefully avoids the need to add any more machine-specific quirks to the reboot code. There's still some divergence between us and Windows (mostly in how often we write to the keyboard controller), which can be cleaned up if it turns out to make a difference anywhere.

Now. Back to EFI bugs.

Syndicated 2011-05-31 18:45:09 from Matthew Garrett

Trials and tribulations with EFI

I wrote about some EFI implementation issues I'd seen on Macs a while back. Shortly afterwards we started seeing approximately identical bugs on some Intel reference platforms, and fixing it actually became more of a priority.

The fundamental problem is the same. We take the EFI memory map, identify the virtual addresses of the regions that will be required for runtime (mapping them into virtual address space if needed) and then call the firmware's SetVirtualAddressMap() implementation in order to let the firmware convert all its pointers. Sadly it seems that some firmware implementations call into sections of boot services code to do this, which is unfortunate because we've already taken that back to use as RAM. So, given that this is clearly against the spec, how does it ever work?

The tediously dull version is that Linux typically calls SetVirtualAddressMap() in the kernel, and everyone else does it in their bootloaders. The bootloader hasn't set up NX bits or anything, so it just happens to work there. We could just do it in the bootloader in Linux, but that makes doing things like kernel address space randomisation trickier, so it's not the favoured approach. So, instead, we can probably just reserve those ranges until after we've switched to virtual mode, and make sure the pages are executable. This ought to land in 2.6.40, or whatever it ends up being called.
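
(For reference, the call in question looks something like this - a gnu-efi style sketch, with variable names of my own invention:)

    /* Sketch (gnu-efi style): the SetVirtualAddressMap() call made
       during the transition to virtual mode. Per the UEFI spec it may
       be called exactly once, with a memory map whose VirtualStart
       fields already hold the OS's runtime mappings. */
    #include <efi.h>
    #include <efilib.h>

    EFI_STATUS switch_to_virtual(EFI_SYSTEM_TABLE *systab,
                                 UINTN map_size, UINTN desc_size,
                                 UINT32 desc_version,
                                 EFI_MEMORY_DESCRIPTOR *map)
    {
        /* the firmware walks the map and converts its own pointers */
        return uefi_call_wrapper(systab->RuntimeServices->SetVirtualAddressMap,
                                 4, map_size, desc_size, desc_version, map);
    }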

(The alternative approach, of just never transitioning out of physical mode, turns out to mysteriously fail on various machines. Calls to SetVariable() just give errors. We just don't know)

That still leaves the problem of SetVariable() on the test Mac trying to access a random address. That one turned out to be easier. There's 2MB of flash at the top of physical address space, and this was being presented as being broken into four separate EFI regions. While physically contiguous, Linux was mapping these to discontiguous virtual addresses. Apple's firmware appeared to assume that a pointer into one region could just be incremented into another. So because it's still easier to change the kernel than change Apple, 2.6.39 merges these regions to ensure they're contiguous.

Remaining problems include some machines seemingly not booting if they have 4GB of RAM or more and this Apple failing to communicate with its panel over the eDP auxchannel. Anyone got any idea how to dump the bios compatibility module out of a running EFI session?

Syndicated 2011-05-25 16:06:03 from Matthew Garrett
