Older blog entries for mjg59 (starting at number 323)

The economic incentive to violate the GPL

My post yesterday on how Google gains financial benefit from vendor GPL violations contained an assertion that some people have questioned - namely, "unscrupulous hardware vendors save money by ignoring their GPL obligations". And, to be fair, as written it's true but not entirely convincing. So instead, let's consider "unscrupulous hardware vendors have economic incentives to ignore their GPL obligations".

The direct act of compliance costs money


Complying with the GPL means having the source code that built the binaries you ship. This is easy if your workflow involves putting source in at one end and getting binaries out at the other, but getting to that workflow means having a certain degree of engineering rigour. If your current build process involves mixing a bunch of known good binaries you got from somewhere but you can't remember where with a hacked up source tree that exists on someone's hard drive and then pushing all of these into a tool that only runs on Windows ME, before taking the resulting image and replacing chunks of it by hand, compliance is effectively impossible.

We all know that this is against all kinds of best practices and probably causes so many problems that it's more expensive in the long term, but retooling and hiring someone to oversee all of this takes time and money, and given the margins on many of these devices that's probably enough to make you uncompetitive for a couple of product cycles. Maybe you'll be in a better position afterwards, but you don't know that there'll be an afterwards.

Suppliers who don't provide you with the source code may be cheaper than those who do


You can't be in compliance if you don't have the source code in the first place. The same arguments that apply to the hardware vendors also apply to the people selling you your chips, so there's also an economic incentive for them to avoid complying. And there's an obvious incentive for you to choose the cheaper chipset, even if they don't comply.

Getting the source may cost money


Buying a chipset doesn't necessarily get you the software that makes it work - several silicon vendors will charge you for the SDK. But many of these devices are effectively reference platforms, so are basically identical from a hardware perspective. So if one of your competitors paid for the SDK, you can just dump the binaries off their machine, flash them onto your own boards and save yourself a decent amount of money. You obviously don't get the source, and nor do you have the standing to insist that the vendor whose binaries you misappropriated give you the source.

In the absence of enforcement, GPL compliance only works if it's the norm


Let's imagine two companies, A and B. Both build a tablet device, and buy the full SDK including source code. Both find a bunch of bugs in the vendor SDK and fix a different subset of them. They ship. A provides source code. B doesn't. B can now take A's bugfixes and incorporate them, resulting in a more compelling product without any significant extra cost. You now have two products that can sell for the same price, but B's is better. A would need to prove that B copied their bugfixes rather than simply fixing them themselves, which probably isn't going to happen.

In a larger market, if B is the only vendor who does this then their advantage isn't large - some of A's work is misappropriated by B, but A does benefit from the engineering work contributed by C, D, E, F and G. A combination of social pressure and legal threats may bring B into compliance. But if infringement is the norm, A has no incentive at all to release the source - by doing so they'll be helping not only B, but also C, D, E, F and G. Everyone undercuts A and they go out of business quite quickly.

Moral: In the absence of enforcement, if everyone else is infringing, a single company who complies is at a disadvantage.

If compliance cost nothing then everyone would do it


You can argue that cheap tablets from China are infringing simply because nobody knows better. But what's HTC's excuse? They've clearly decided that there's a benefit in holding back their source code releases[1], balancing this against the risk of being sued. They know full well what they're doing. If compliance was free they'd ship the source at the same time as they shipped the binaries. Other significant vendors are also fully aware of their obligations but choose to ignore them anyway.

Summary


There are economic incentives to infringe the GPL, and therefore (all else being equal) an infringing device can be sold for less money. All else being equal, a cheaper device will sell more units. More sales means more devices selling adverts for Google. Google makes more money because Android vendors infringe the GPL.

[1] The usual argument is "We will release the source code within 120 days", implying that it's a process that takes time and we should just be patient. Every single time I've started making threatening noises, the source has appeared within a week.

Syndicated 2012-01-04 15:08:34 from Matthew Garrett

Android, GPL violations and Google

A bit over a year ago, I wrote about how an incredible number of Android tablets on the market were in violation of the terms of the GPL. I've had rather a lot else to do since then so it's now awfully out of date - but taking a quick browse through the current stack of cheaper devices indicates that things aren't all that much better. We've got source code for some chipsets that were missing it before, but to compensate we've got a whole bunch of new hardware that's entirely lacking. It's all pretty poor, really.

At the time, I wrote the following:

"(Side note: People sometimes ask why Google aren't doing more to prevent infringing devices. For the vast majority of these cases, Google's sole contribution has been to put Android source code on a public website. Red Hat own more of the infringing code than Google do. There's no real reason why Google should be the ones taking the lead role here, and there's fairly sound business reasons why it's not in their interest to do so)"

Factually speaking, nothing's changed. Each of these devices contains code owned by Google, and Google could absolutely take legal action against the vendors. Equally, so could Red Hat, Intel, Nokia and dozens of other companies who hold copyright on portions of the code carried on these devices, and so could thousands of individuals around the world. Nobody's obliged to enforce their copyrights, and in the absence of anyone else doing so it's unreasonable to insist that Google should do it.

However.

Google gives Android away. This seems like an odd thing for them to do, given that it's a significant engineering effort and costs a lot of money to produce. But remember what Android brings to Google - it's a platform with a well-integrated mechanism for distributing advertising to users. Scanning the market shows a huge number of ad-supported apps, and Google's getting money for every single one of those that gets shown. The more Android devices, the bigger the market for apps - and the wider their advertising reach.

In other words: unscrupulous hardware vendors save money by ignoring their GPL obligations. This lets them appeal on price, increasing the number of Android devices in use and increasing Google's profits. Google makes money off other people's violation of the GPL.

Could Google do anything to stop this? Yes. They could sue for copyright infringement, but that kind of thing's time consuming and awkward and any argument about the GPL always seems to end up as a big argument involving conspiracy theories. Instead, Google could attach some extra conditions to the Android trademark. Requiring that the trademark only be attached to GPL-compliant products ought to allow Google to take advantage of the existing well-tested mechanisms for seizing counterfeit goods, providing a direct economic incentive for companies to come into compliance. For added marks, they could restrict the adwords code to devices that use the trademark - if the vendor removes the trademark, applications depending on the adwords functionality would refuse to run and Google wouldn't make money off the infringing hardware.

Or, of course, they could just carry on making extra money as a result of vendors denying users the freedoms granted by the copyright holders. Although that sounds kind of evil to me.

Syndicated 2012-01-04 03:11:38 from Matthew Garrett

TVs are all awful

A discussion a couple of days ago about DPI detection (which is best summarised by this and this and I am not having this discussion again) made me remember a chain of other awful things about consumer displays and EDID and there not being enough gin in the world, and reading various bits of the internet and wikipedia seemed to indicate that almost everybody who's written about this has issues with either (a) technology or (b) English, so I might as well write something.

The first problem is unique (I hope) to 720p LCD TVs. 720p is an HD broadcast standard that's defined as having a resolution of 1280x720. A 720p TV is able to display that image without any downscaling. So, naively, you'd expect them to have 1280x720 displays. Now obviously I wouldn't bother mentioning this unless there was some kind of hilarious insanity involved, so you'll be entirely unsurprised when I tell you that most actually have 1366x768 displays. So your 720p content has to be upscaled to fill the screen anyway, but given that you'd have to do the same for displaying 720p content on a 1920x1080 device this isn't the worst thing ever in the world. No, it's more subtle than that.

EDID is a standard for a blob of data that allows a display device to express its capabilities to a video source in order to ensure that an appropriate mode is negotiated. It allows resolutions to be expressed in a bunch of ways - you can set a bunch of bits to indicate which standard modes you support (1366x768 is not one of these standard modes), you can express the standard timing resolution (the horizontal resolution divided by 8, followed by an aspect ratio) and you can express a detailed timing block (a full description of a supported resolution).

1366/8 = 170.75. Hm.

Ok, so 1366x768 can't be expressed in the standard timing resolution block. The closest you can provide for the horizontal resolution is either 1360 or 1368. You also can't supply a vertical resolution - all you can do is say that it's a 16:9 mode. For 1360, that ends up being 765. For 1368, that ends up being 769.
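
To make the packing arithmetic concrete, here's a minimal sketch of a standard timing descriptor (the byte layout is from the EDID 1.3 spec; pack_standard_timing is my own name, not a real library function):

#include <stdint.h>
#include <stdio.h>

/* EDID 1.3 standard timing descriptor: two bytes per mode.
 * Byte 0: horizontal resolution / 8, minus 31.
 * Byte 1: bits 7-6 = aspect ratio (11 = 16:9), bits 5-0 = refresh rate - 60. */
static int pack_standard_timing(int hres, uint8_t out[2])
{
    if (hres % 8)
        return -1;                      /* 1366 fails right here: 1366/8 = 170.75 */
    out[0] = hres / 8 - 31;
    out[1] = (3 << 6) | (60 - 60);      /* 16:9, 60Hz */
    return 0;
}

int main(void)
{
    int modes[] = { 1360, 1366, 1368 };
    for (int i = 0; i < 3; i++) {
        uint8_t st[2];
        if (pack_standard_timing(modes[i], st))
            printf("%d: not representable\n", modes[i]);
        else                            /* vertical is implied: h * 9 / 16 */
            printf("%d: bytes %02x %02x -> %dx%d\n", modes[i],
                   st[0], st[1], modes[i], modes[i] * 9 / 16);
    }
    return 0;
}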

It's ok, though, because you can just put this in the detailed timing block, except it turns out that basically no TVs do, probably because the people making them are the ones who've taken all the gin.

So what we end up with is a bunch of hardware that people assume is 1280x720 but is actually 1366x768, and which tells your computer it's either 1360x765 or 1368x769. And you're probably running an OS that's doing sub-pixel anti-aliasing, which requires that the hardware be able to address the pixels directly - obviously difficult if you think the screen is one size when it's actually another. Thankfully Linux takes care of you here, and this code makes everything ok. Phew, eh?

But ha ha, no, it's worse than that. And the rest applies to 1080p ones as well.

Back in the old days when TV signals were analogue and got turned into a picture by a bunch of magnets waving a beam of electrons about all over the place, it was impossible to guarantee that all TV sets were adjusted correctly and so you couldn't assume that the edges of a picture would actually be visible to the viewer. In order to put text on screen without risking bits of it being lost, you had to steer clear of the edges. Over time this became roughly standardised and the areas of the signal that weren't expected to be displayed were called overscan. Now, of course, we're in a mostly digital world and such things can be ignored, except that when digital TVs first appeared they were mostly used to watch analogue signals so still needed to overscan because otherwise you'd have the titles floating weirdly in the middle of the screen rather than towards the edges, and so because it's never possible to kill technology that's escaped into the wild we're stuck with it.

tl;dr - Your 1920x1080 TV takes a 1920x1080 signal, chops the edges off it and then stretches the rest to fit the screen because of decisions made in the 1930s.

So you plug your computer into a TV and even though you know what the resolution really is you still don't get to address the individual pixels. Even worse, the edges of your screen are missing.

The best thing about overscan is that it's not rigorously standardised - different broadcast bodies have different recommendations, but you're then still at the mercy of what your TV vendor decided to implement. So what usually happens is that graphics vendors have some way in their drivers to compensate for overscan, which involves you manually setting the degree of overscan that your TV provides. This works very simply - you take your 1920x1080 framebuffer and draw different sized black borders until the edge of your desktop lines up with the edge of your TV. The best bit about this is that while you're still scanning out a 1920x1080 mode, your desktop has now shrunk to something more like 1728x972 and your TV is then scaling it back up to 1920x1080. Once again, you lose.
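
For the curious, the compensation is nothing more than this arithmetic (a toy sketch; the 5%-per-edge border matches the 1728x972 figure above, but the real value is whatever you dialled in by eye):

#include <stdio.h>

int main(void)
{
    int mode_w = 1920, mode_h = 1080;
    int border_x = 96, border_y = 54;   /* 5% of each dimension, per side */

    /* Still scanning out 1920x1080; the desktop just shrinks to fit
     * inside the black borders, and the TV scales it back up. */
    printf("scanout %dx%d, usable desktop %dx%d\n",
           mode_w, mode_h, mode_w - 2 * border_x, mode_h - 2 * border_y);
    return 0;
}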

The HDMI spec actually defines an extension block for EDID that indicates whether the display will overscan, but doesn't provide any way to work out how much it'll overscan. We haven't seen many of those in the wild. It's also possible to send an HDMI information frame that indicates whether the video source expects to be overscanned, but (a) we don't do that and (b) it'll probably be ignored even if we did, because who ever tests this stuff. The HDMI spec also says that the default behaviour for 1920x1080 (but not 1366x768) should be to assume overscan. Charming.

The best thing about all of this is that the same TV will often have different behaviour depending on whether you connect via DVI or HDMI, but some TVs will still overscan DVI. Some TVs have options in the menu to disable overscan and others don't. Some monitors will overscan if you feed them an HD resolution over HDMI, so if you have HD content and don't want to lose the edges then your hardware needs to scale it down and let the display scale it back up again. It's all awful. I recommend you drink until everything's already blurry and then none of this will matter.

Syndicated 2012-01-03 17:46:40 from Matthew Garrett

Clarifying the "secure boot attack"

Yesterday I wrote about an alleged attack on the Windows 8 secure boot implementation. As I later clarified, it turns out that the story was, to put it charitably, entirely wrong. The attack is a boot kit that targets BIOS-based systems. It lives in the MBR. It'll never be executed on any UEFI systems, let alone secure boot ones. In fact, this is precisely the kind of attack that secure boot is intended to protect against. So, context.

The MBR contains code that's executed by the BIOS at boot time. This code is unverifiable - it's permitted to have arbitrary functionality. There's only 440 bytes, but that's enough to jump to somewhere else and read code from elsewhere. There's no way for the BIOS to know that this code is malicious. And one thing this code can obviously do is load the normal boot code and modify it to behave differently. Any self-validation code in the loader can be patched out at this point. The modified loader will then load the kernel, and potentially also modify it. At this point, you've lost. Any attempts to validate the code can be redirected to the original code and so everything will look fine, up until the point where the user runs a specific application and suddenly your kernel is sending all your keystrokes over UDP to someone in Nigeria.

These attacks exist now. They're in the wild. In a normal UEFI world you'd do the same thing by just replacing the UEFI bootloader. But with secure boot you'll be able to validate that the bootloader is appropriately signed and if someone's modified it you'll drop into some remediation mode that recovers your files, from install media if necessary.

Obviously, this protection is based on all the components of secure boot (ie, everything that runs before ExitBootServices() is called) being perfect. As I said, if any of them accept untrusted input and misinterpret it in such a way that they can be tricked into running arbitrary code, you'll still have problems. But when discussing the pros and cons of secure boot, it's important to make sure that we're talking about reality rather than making provably false assertions.

Syndicated 2011-11-18 21:03:08 from Matthew Garrett

Attacks on secure boot

This is interesting. It's obviously still lacking in detail, but it does highlight one weakness of secure boot. The security for secure boot is all rooted in the firmware - there's no external measurement to validate that everything functioned as expected. That means that if you can cause any trusted component to execute arbitrary code then you've won. So, what reads arbitrary user data? The most obvious components are any driver that binds to user-controlled hardware, any filesystem driver that reads user-provided filesystems and any signed bootloader that reads user-configured data. A USB drive could potentially trigger a bug in the USB stack and run arbitrary code. A malformed FAT filesystem could potentially trigger a bug in the FAT driver and run arbitrary code. A malformed bootloader configuration file or kernel could potentially trigger a bug in the bootloader and run arbitrary code. It may even be possible to find bugs in the PE-COFF binary loader. And once you have the ability to run arbitrary code, you can replace all the EFI entry points and convince the OS that everything is fine anyway.

None of this should be surprising. Secure boot is predicated upon the firmware only executing trusted material until the OS handoff. If you can sidestep that restriction then the entire chain of trust falls down. We're talking about a large body of code that was written without the assumption that it would have to be resistant to sustained attack, and which has now been put in a context where people are actively trying to break it. Bugs are pretty inevitable. I'd expect a lot of work to be done on firmware implementations between now and Windows 8 ship date.

Syndicated 2011-11-17 16:52:03 from Matthew Garrett

GPT disks in a BIOS world

Starting with Fedora 16 we're installing using GPT disklabels by default, even on BIOS-based systems. This is worth noting because most BIOSes have absolutely no idea what GPT is, which you'd think would create some problems. And, unsurprisingly, it does. Shock. But let's have an overview.

GPT, or GUID Partition Table, is part of the UEFI specification. It defines a partition table format that allows up to 128 partitions per disk, with 64 bit start and end values allowing partitions up to 9.4ZB (assuming 512 byte blocks). This is great, because the existing MBR partitioning format only allows up to 2.2TB when using 512 byte blocks. But most BIOSes (and most older operating systems) don't understand GPT, so plugging in a GPT-partitioned disk would result in the system believing that the drive was uninitialised. This is avoided by specifying a protective MBR. This is a valid MBR partition table with a single partition covering the entire disk (or the first 2.2TB of the disk if it's larger than that) and the partition type set to 0xee ("GPT Protective"). GPT-unaware BIOSes and operating systems will see a partition they don't understand and simply ignore it.
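
A sketch of what that protective entry looks like on disk may help (the field names are mine, and I'm glossing over the CHS fields, which the spec says to fill in but which nothing modern reads):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct mbr_entry {
    uint8_t  status;        /* bootable flag - per the UEFI spec, 0 */
    uint8_t  chs_first[3];
    uint8_t  type;          /* 0xee = "GPT Protective" */
    uint8_t  chs_last[3];
    uint32_t lba_first;     /* little-endian on disk, as on x86 */
    uint32_t lba_count;
} __attribute__((packed));

void make_protective_mbr(uint8_t sector[512], uint64_t disk_sectors)
{
    struct mbr_entry e = { 0 };
    uint64_t count = disk_sectors - 1;          /* everything after LBA 0 */

    e.type = 0xee;
    e.lba_first = 1;                            /* the GPT header lives at LBA 1 */
    e.lba_count = count > 0xffffffff ? 0xffffffff : (uint32_t)count;

    memset(sector, 0, 512);
    memcpy(sector + 446, &e, sizeof(e));        /* first of the four entry slots */
    sector[510] = 0x55;                         /* MBR boot signature */
    sector[511] = 0xaa;
}

int main(void)
{
    uint8_t sector[512];
    struct mbr_entry e;

    make_protective_mbr(sector, 7812500000ULL); /* a 4TB disk, in 512-byte sectors */
    memcpy(&e, sector + 446, sizeof(e));
    printf("type 0x%02x, first LBA %u, count %u (clamped to 2.2TB)\n",
           e.type, e.lba_first, e.lba_count);
    return 0;
}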

But how do we boot a GPT-labelled disk with a protective MBR on a system that doesn't understand GPT? The key here is that BIOS is pretty dumb. Typically a BIOS will see a disk and just attempt to execute the code in the first sector. This MBR code knows how to do the rest of the boot, including parsing the partition table if necessary. The BIOS doesn't need to care at all.

Of course, some BIOSes choose to care. We've seen a small number of machines that, when exposed to a GPT disk, refuse to boot because they parse the MBR partition map and don't like what they see. This is typically accompanied by a message along the lines of "No operating system found". What we've found is that they're looking for a partition marked with the bootable flag, and if no partitions are marked bootable they assume that there's no OS. This is in contrast to the traditional use of the flag, which is merely a hint to the MBR as to which partition's boot code it should execute.

So, should we set that flag? The UEFI specification specifically forbids it - table 15 states that the BootIndicator byte must be set to 0. Once again we're left in an unfortunate position where the specification and reality collide in an awkward way.

If this happens to you after a Fedora 16 install, you have two choices. The first is to reinstall with the "nogpt" boot argument. The installer will then set up a traditional MBR partition table. The second is to boot off a live CD and run fdisk against the boot disk. It'll give a bunch of scary warnings. Ignore them. Hit "a", then "1", then "w" to write it to disk. Things ought to work then. We'll figure out something better for F17.
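
What fdisk's "a" is doing there is flipping a single byte. A hypothetical standalone equivalent, for the curious - point it at the wrong device and you will regret it:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    unsigned char flag = 0x80;          /* 0x80 = bootable, 0x00 = not */
    int fd = open("/dev/sda", O_RDWR);  /* adjust to your actual boot disk */

    /* The flag is the first byte of the first partition entry, which
     * lives at offset 446 in the MBR. */
    if (fd < 0 || pwrite(fd, &flag, 1, 446) != 1) {
        perror("setting bootable flag");
        return 1;
    }
    close(fd);
    return 0;
}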

Syndicated 2011-11-17 15:54:30 from Matthew Garrett

Making timeouts work with suspend

A reasonably common design for applications that want to run code at a specific time is something like:

time_t wakeup_time = get_next_event_time();
time_t now = time(NULL);
sleep(wakeup_time-now);

This works absolutely fine, except that sleep() ignores time spent with a suspended system. If you sleep(3600) and then immediately suspend for 45 minutes you'll wake up after 105 minutes, not 60. Which probably isn't what you want. If you want a timer that'll expire at a specific time (or immediately after resume if that time passed during suspend), use the POSIX timer family (timer_create, timer_settime and friends) with CLOCK_REALTIME. It's a signal-driven interface rather than a blocking one, so implementation may be a little more complicated, but it has the advantage of actually working.
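
A minimal sketch of that approach (the hardcoded timeout stands in for get_next_event_time() from the snippet above; you may need -lrt on older glibc):

#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t fired;

static void on_timer(int sig)
{
    (void)sig;
    fired = 1;
}

int main(void)
{
    struct sigaction sa = { .sa_handler = on_timer };
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    struct sigevent sev = {
        .sigev_notify = SIGEV_SIGNAL,
        .sigev_signo  = SIGALRM,
    };
    timer_t timer;
    timer_create(CLOCK_REALTIME, &sev, &timer);

    /* TIMER_ABSTIME: expire when the wall clock reaches this value, or
     * immediately after resume if it passed while we were suspended. */
    struct itimerspec its = { 0 };
    its.it_value.tv_sec = time(NULL) + 3600;    /* stand-in for get_next_event_time() */
    timer_settime(timer, TIMER_ABSTIME, &its, NULL);

    while (!fired)
        pause();
    puts("event time reached");
    return 0;
}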

Syndicated 2011-11-17 13:51:47 from Matthew Garrett

Properly booting a Mac

This is mostly for my own reference, but since it might be useful to others:

By "Properly booting" I mean "Integrating into the boot system as well as Mac OS X does". The device should be visible from the boot picker menu and should be selectable as a startup disk. For this to happen the boot partition should be in HFS+ format and have the following files:

  • /mach_kernel (can be empty)
  • /System/Library/CoreServices/boot.efi (may be booted, if so should be a symlink to the actual bootloader)
  • /System/Library/CoreServices/SystemVersion.plist which should look something like
    <?xml version="1.0" encoding="UTF-8"?>
    <plist version="1.0">
    <dict>
            <key>ProductBuildVersion</key>
            <string></string>
            <key>ProductName</key>
            <string>Linux</string>
            <key>ProductVersion</key>
            <string>Fedora 16</string>
    </dict>
    </plist>
That's enough to get it to appear in the Startup Disk preference pane. Getting it in the boot picker requires it to be blessed. You probably also want a .VolumeIcon.icns in / in order to get an appropriate icon.

Now all I need is an aesthetically appealing boot loader.

Syndicated 2011-11-09 22:06:07 from Matthew Garrett

Understanding the current state of UEFI

This story has been floating around for a week or so. The summary is that someone bought a system that has UEFI and is having trouble installing Linux on it. In itself, not a problem. But various people have either conflated this with the secure boot issue or suggested that UEFI is a fundamentally anti-Linux technology.

Right now there are no machines shipping to the public with secure boot enabled. None at all. If you're having problems installing Linux on a machine with UEFI then it's not because of secure boot. So what is actually causing the problem?

UEFI is a complicated specification, with 2.3.1A being 2214 pages long. It's a large body of code. There's a lot of subtleties. It's very easy for people to get things wrong. For example, we've seen issues where calling SetVirtualAddressMap() resulted in the firmware referencing boot services code, a clear violation of the spec on the firmware authors' part. We've also found machines that failed to boot because grub wasn't aligning its stack properly, a clear violation of the spec on our part.

Software is difficult. People make mistakes. When something mysteriously fails to work the immediate assumption should be that you've found a bug, not a conspiracy. Over time we'll find those bugs and fix them, but until then just treat UEFI boot failures like any other bug - annoying, but not malicious.

Syndicated 2011-11-03 17:47:36 from Matthew Garrett

UEFI secure boot white paper

As people have probably noticed, last week we published a white paper on the UEFI secure boot issue. This was written in collaboration with Canonical and wouldn't have been possible without the tireless work of Jeremy Kerr. It's the kind of problem that affects the entire Linux community, and I'm glad that we've both demonstrated that being competitors is less important than working together for the benefit of everyone.

Syndicated 2011-11-03 17:47:12 from Matthew Garrett
