Older blog entries for mjg59 (starting at number 169)

Aggressive graphics power management

My current desktop PC has an RS790-based Radeon on-board graphics controller. It also has a Radeon X1900 plugged in. Playing with my Watts Up, I found that the system was (at idle!) drawing around 35W more power with the X1900 than with the on-board graphics.

This is clearly less than ideal.

Recent Radeons all support dynamic clock gating, a technology where the clocks to various bits of the chip are turned off when not in use. Unfortunately it seems that this is generally already enabled by the BIOS on most hardware, so playing with that didn't give me any power savings. Next I looked at Powerplay, the AMD technology for reducing clocks and voltages. It turns out that my desktop hardware doesn't provide any Powerplay tables, so no joy there either. What next?

Radeons all carry a ROM containing a bunch of tables and scripts written in a straightforward bytecode language called Atom. The idea is that OS-specific drivers can call the Atom tables to perform tasks that are hardware dependent, even without knowledge of the specific low-level nature of the hardware they're driving. You can use Atom to do several things, from card initialisation through mode setting to (crucially) setting the clock frequencies. Jerome Glisse wrote a small utility called Atomtools that lets you execute Atom scripts and set the core and RAM frequencies. Playing with this showed that it was possible to save the best part of 5W by underclocking the graphics core, and about the same again by reducing the memory clock. A total saving of 9-10W was pretty significant.

The main problem with reducing the memory clock was that doing it while the screen is being scanned out results in memory corruption, showing up as big ugly graphical artifacts on the screen. I'm a fan of doing power management as aggressively as possible, which means reclocking the memory whenever the system is idle. Turning the screen off to reclock the memory would avoid the graphical corruption but introduce irritating flicker, so that wasn't really an option. The next plan was to synchronise the memory reclocking to the vertical refresh interval, the period of time between the bottom of a frame and the top of the next frame being drawn. Unfortunately setting the memory frequency took somewhere between 2 and 20 milliseconds, far too long to finish inside that time period.
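For a rough sense of why vblank synchronisation fails here, a back-of-envelope calculation (the CRTC timings are illustrative, not measured from any particular card):

```python
# Back-of-envelope numbers for the vblank window, using illustrative
# 1280x1024@60 CRTC timings.

def frame_period_ms(refresh_hz):
    """Total time for one frame at a given refresh rate."""
    return 1000.0 / refresh_hz

def vblank_ms(refresh_hz, total_lines, visible_lines):
    """Time per frame spent in the vertical blanking interval."""
    return frame_period_ms(refresh_hz) * (total_lines - visible_lines) / total_lines

blank = vblank_ms(60, 1066, 1024)
print("frame %.2f ms, vblank %.2f ms" % (frame_period_ms(60), blank))
# A 2-20ms reclock has no hope of fitting in a blanking window well under 1ms.
```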

So. Just using Atom was clearly not going to be possible. The next step was to try writing the registers directly. Looking at the R500 register documentation showed that the MPLL_FUNC_CNTL register contained the PLL dividers for the memory clock. Simply smacking a new value in here would allow changing the frequency of the memory clock with a single register write. It even worked. Almost. I could change the frequency within small ranges, but going any further resulted in increasingly severe graphical corruption. Unlike the sort I got with the Atom approach to changing the frequency, this corruption manifested itself as a range of effects from shimmering on the screen down to blocks of image gradually disappearing in an impressively trippy (though somewhat disturbing) way.
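For illustration, the memory clock coming out of a PLL is just the reference clock scaled by the dividers. The 27MHz reference and these particular divider values are assumptions for the sake of the example, not the actual MPLL_FUNC_CNTL field layout:

```python
# Generic PLL arithmetic: output = reference * feedback / (ref_div * post_div).
# Reference frequency and divider values here are illustrative assumptions.

def pll_output_mhz(ref_mhz, fb_div, ref_div, post_div):
    """Frequency produced by a PLL for a given set of dividers."""
    return ref_mhz * fb_div / float(ref_div * post_div)

# A hypothetical 396MHz memory clock:
print(pll_output_mhz(27, 88, 3, 2))
# Halving the feedback divider halves the clock - which is why a single
# register write is enough to change the frequency.
print(pll_output_mhz(27, 44, 3, 2))
```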

Next step was to perform a register dump before and after changing the frequencies via Atom, and compare them to the registers I was programming. MC_ARB_RATIO_CLK_SEQ was consistently different, which is where things got interesting. The AMD docs helpfully describe this register as "Magic field, please use the excel programming guide. Sets the hclk/sclk ratio in the arbiter", about as helpful as being told that the register contents are defined by careful examination of a series of butterflies kept somewhere in Taiwan. Now what?
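The comparison itself is mundane; a sketch of the sort of register diff involved (the values here are illustrative):

```python
# Sketch of the before/after register-dump comparison.

def diff_registers(before, after):
    """Return {name: (old, new)} for registers whose values changed."""
    return {reg: (before[reg], after[reg])
            for reg in before
            if reg in after and before[reg] != after[reg]}

before = {"MPLL_FUNC_CNTL": 0x00AD6001, "MC_ARB_RATIO_CLK_SEQ": 0x00002222}
after = {"MPLL_FUNC_CNTL": 0x00AD6001, "MC_ARB_RATIO_CLK_SEQ": 0x00003333}
changed = diff_registers(before, after)
print(changed)
```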

Back to Atomtools. Enabling debugging let me watch a dump of the Atom script as it ran. The relevant part of the dump is here. The most significant point was:

src: ID[0x0000+B39E].[31:0] -> 0xFF7FFF7F
dst: REG[0xFE16].[31:0] <- 0xFF7FFF7F
This showed that the value in question was being read out of a table in the video BIOS (ID[0x0000+B39E] indicating the base of the ROM plus 0xB39E). Looking further back showed that WS[0x40] contained a number that was used as an index into the table. Grepping the header files gave 0x40 as ATOM_WS_QUOTIENT, containing the quotient of a division operation immediately beforehand. Working back from there showed that the value was derived from a formula involving the divider frequencies of the memory PLL and the source PLL. Reimplementing that was trivial, and now I could program the same register values. Hurrah!
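A hypothetical reconstruction of what the script appears to do; the names, units and exact formula are guesses, and only the shape (divide, keep the quotient in WS[0x40], use it to index a table in the ROM) comes from the dump:

```python
# Hypothetical reconstruction of the table lookup - illustrative only.

def arb_seq_value(rom, table_base, mpll_khz, spll_khz):
    """Index a BIOS table by the quotient of the two PLL frequencies."""
    quotient = mpll_khz // spll_khz    # this is what lands in WS[0x40]
    return rom[table_base + quotient]  # a table of 32-bit entries in the ROM

rom = [0] * 0xC000
rom[0xB300 + 3] = 0xFF7FFF7F           # pretend table entry
print(hex(arb_seq_value(rom, 0xB300, 350000, 100000)))
```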

It didn't work, of course. These things never do. It looked like modifying this value didn't actually do anything unless the memory controller was reinitialised. Looking through the Atom dump showed that this was achieved by calling the MemoryDeviceInit script. Reimplementing this from scratch was one option, but it had a bunch of branches and frankly I'm lazy and that's why I work on this Linux stuff rather than getting a proper job. This particular script was fast, so there was no real reason to do it by hand instead of just using the interpreter. Timing showed that doing so could easily be done within the vblank interval. This time, it even worked.

I've done a proof of concept that involved wedging this into the Radeon DRM code with extreme prejudice, but it needs some rework. However, it demonstrates that it's possible to downclock the memory whenever the screen is idle without there being any observable screen flicker. Combine that with GPU downclocking and we can save about 10W without any noticeable degradation in performance or output. Victory!

I gave the code to someone with an X1300 and it promptly corrupted their screen and locked their machine up. Oh well. Turns out that they have a different memory controller or some such madness.

So, obviously, there's more work to be done on this. I've put some test code here. It's a small program that should be run as root. It should reprogram an Atom-based discrete graphics card[1] to half its memory clock. Running it again will halve it again. I don't recommend doing that. You'll need to reboot to get the full clock back. This isn't vblank synced, so it may introduce some graphical corruption. If the corruption is static (i.e., isn't moving or flickering) then that's fine. If it's moving then I (and/or the docs) suck and there's still work to be done. If your machine hangs then I'm interested in knowing what hardware you have and may have some further debugging code to be run. Unless you have an X1300, in which case it's known to break and what were you thinking running this code you crazy mad fool.

Once this is stable it shouldn't take long to integrate it into the DRM and X layers. I'm also trying to get hold of some mobile AMD hardware to test what impact we can have on laptops.

[1] Shockingly enough, it's somewhat harder to underclock graphics memory on a shared-memory system.

Syndicated 2008-11-18 20:20:25 from Matthew Garrett

And another thing

I swear I'm going out in a minute, but:

Running strings on the firmware for a Dlink wireless bridge I have gives output that includes the following:

From isolation / Deliver me o Xbox - / Through the ethernet
Copyright (c) Microsoft Corporation. All Rights Reserved.
Device is Xbox Compatible. Copyright (c) Microsoft Corporation. All Rights Reserved.
This confused me for a while until I plugged it into an Xbox 360 and discovered that, despite the bridge being plugged into the ethernet port, I could control the wifi options, including network selection and encryption method. Does anyone have the faintest idea how this is implemented? A tcpdump of the Xbox booting reveals some ICMPv6 packets, a bunch of DHCP and some UPnP service discovery. UPnP seems like a plausible option, but I've got no idea how to probe a device for UPnP services using Linux. Has anyone played with reverse engineering this stuff? Googling didn't turn anything up.
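UPnP discovery is just SSDP: multicast an M-SEARCH request to 239.255.255.250:1900 and listen for unicast replies. A quick sketch for probing from Linux (whether the bridge actually speaks SSDP is, of course, the open question):

```python
import socket

# SSDP is UPnP's discovery protocol: devices answer a multicast M-SEARCH
# with HTTP-over-UDP responses pointing at their service descriptions.

SSDP_ADDR, SSDP_PORT = "239.255.255.250", 1900

def msearch_request(search_target="ssdp:all", mx=2):
    """Build an M-SEARCH datagram asking all devices to announce themselves."""
    return ("M-SEARCH * HTTP/1.1\r\n"
            "HOST: %s:%d\r\n"
            "MAN: \"ssdp:discover\"\r\n"
            "MX: %d\r\n"
            "ST: %s\r\n\r\n" % (SSDP_ADDR, SSDP_PORT, mx, search_target)).encode()

def discover(timeout=3):
    """Multicast the request and collect whatever answers before the timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.sendto(msearch_request(), (SSDP_ADDR, SSDP_PORT))
    replies = []
    try:
        while True:
            replies.append(sock.recvfrom(65507))
    except socket.timeout:
        pass
    return replies

print(msearch_request().decode())
```

Running discover() on the bridge's subnet would at least show whether it advertises any SSDP services at all.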

Syndicated 2008-11-15 20:36:02 from Matthew Garrett

Hybrid suspend

One often requested feature for suspend support on Linux is hybrid suspend, or "suspend to both". This is where the system suspends to disk, but then puts the machine in S3 rather than powering down. If the user resumes without power having been removed they get the benefit of the fast S3 resume. If not, the system resumes from disk and no data is lost.

This is, clearly, the way suspend should work. We're not planning on adding it by default in Fedora, though, for a couple of reasons. The main reason right now is that the in-kernel suspend to disk is still slow. Triggering a suspend to disk on a machine with gigabytes of RAM (which is a basic laptop configuration these days) will leave you sitting there for an extended period of time when all you actually want to do is pick your machine up and leave. Fixing this properly is less than trivial. TuxOnIce improves the speed somewhat, at the expense of being a >500k patch against upstream that touches all sorts of interesting bits of the kernel such as the vm system. We're not supporting that for fairly obvious reasons. But even then, the suspend to disk process involves discarding some pages. Those need to be pulled in from disk again on resume. With the current implementation, suspend to both is fundamentally slower than suspend to RAM for both the suspend and resume paths.

So, what other approaches are there? One is to resume from RAM some period of time after suspending and then, if the battery is low, suspend to disk. Many recent machines will automatically resume when the battery level becomes critical. If the hardware doesn't support that, we can wake up after a set period and measure the battery consumption, set a new alarm and go to sleep again. The downside of this approach is that your system wakes up and does stuff without you being aware of it, which may be bad if it's inside a neoprene cover at the time. Cooking laptops is generally considered unhelpful.
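The wake-and-check policy might be sketched like this (every number here is made up):

```python
# Sketch of the periodic battery-check policy for suspend-to-both.

def next_wake_seconds(charge_now, charge_critical, drain_per_hour,
                      max_interval=3600):
    """How long to sleep before the next battery check.

    Wake at roughly the halfway point to the critical level so there's
    still margin left to suspend to disk, and never sleep longer than
    max_interval so a bad drain estimate can't cook the laptop.
    """
    if drain_per_hour <= 0:
        return max_interval
    hours_left = (charge_now - charge_critical) / float(drain_per_hour)
    return max(60, min(max_interval, int(hours_left * 3600 / 2)))

print(next_wake_seconds(50, 5, 45))   # an hour of charge left: check in 30 min
print(next_wake_seconds(6, 5, 100))   # nearly critical: check again very soon
```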

Using the kexec approach to hibernation provides a more straightforward way of handling the problem. The fundamental problem with the existing approach is that it ties suspend into the vm system and involves making atomic copies of RAM into other bits of RAM. kexec would allow us to pre-allocate enough space on disk to save RAM as-is, and then simply kexec into a new kernel and dump RAM to disk without any of the tedious shrinking required first. Resuming from S3 would kexec back into the old kernel, whereas losing power would just fall back to reading off disk. The extra time taken on the S3 path would be minimal.
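The resulting resume-time decision is simple; a sketch of the logic just described, not code from any real implementation:

```python
# Which path the bootstrap takes after a hybrid (kexec-based) suspend.

def resume_action(s3_resume_worked, image_valid):
    """s3_resume_worked: power stayed on and we woke from RAM."""
    if s3_resume_worked:
        return "kexec back into the suspended kernel"
    if image_valid:
        return "restore the RAM image from disk"
    return "cold boot"

print(resume_action(True, True))
print(resume_action(False, True))
```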

In an ideal world we'd adopt the Vista approach where "off" is synonymous with suspend. There's still more work to be done on enhancing reliability before that can be achieved, though.

Syndicated 2008-11-15 18:35:11 from Matthew Garrett

Adventures in PCI hotplug

I played with an Eee for a bit last time I was in Boston, culminating in a patch to make the eeepc-laptop driver use standard interfaces rather than just having random files in /sys that people need to write custom scripts to use. The world became a better place.

However. Asus implemented the rfkill control on the Eee in a slightly odd way. Disabling the wifi actually causes the entire card to drop off the bus, similar to how Bluetooth is normally handled. The difference is that the Bluetooth dongles are almost exclusively USB, while the Eee's wifi is PCI. Linux supports hotplugging of PCI devices, but nothing seemed to work out of the box on the Eee. Another case of this was the SD reader in the Acer Aspire One. Unless a card was present in the slot during boot, it simply wouldn't appear on the PCI bus. It turned out that Acer have implemented things in such a way that removing the card results in the entire chip being unplugged. This was when I started looking more closely into how this functionality is implemented.

The two common cases of PCI hotplug are native PCIe hotplug and ACPI mediated hotplug. In the former case, the chipset generates an interrupt when a hotplug event occurs and the OS then rescans the bus. This is a mildly complicated operation, requiring enabling the slot, checking whether there's a card there, powering the card and all its functions up, waiting for the PCIe link to settle and then announcing the new PCI device to the rest of the OS. ACPI-mediated hotplugging puts more of the load on the firmware rather than the OS - the hotplug event generates a notify message that is caught by the ACPI interpreter in the OS, allowing the OS to check for device presence by calling another ACPI method. If the device is present it's then a simple matter of telling the PCI layer about it.

Native PCIe hotplug has the advantage that there's much less vendor code involved. ACPI is still involved to an extent - an _OSC method on the PCIe bridge is called to allow the OS to tell the firmware that it supports handling hotplug events. This allows the firmware to stop sending any ACPI notifications. ACPI hotplugging requires more support in the firmware, but can work for PCI as well as PCIe.

The general approach taken to getting the Eee's wifi hotplugging to work has been to load the pciehp driver with the pciehp_force=1 argument. This tells the driver to listen for hotplug events even when there's no _OSC method with which to tell the firmware that the OS is handling things now. Since the hardware will generate the event anyway, things work. However, this is non-ideal. Some hardware exists where ACPI hotplugging will work but, due to quirks in the hardware design, native PCIe hotplug control will fail. Vendors handle this in their firmware by having the _OSC method fail, signalling to the pciehp driver that it shouldn't bind to the port. Using pciehp_force overrides that, leading to a situation where hardware could potentially be removed from a port that's still powered up. Unfortunate.

My first approach was to add a new argument to pciehp called pciehp_passive. This would indicate to the pciehp driver that it should only listen for notifications from the hardware. User-triggered events would not be supported, avoiding the situation where anyone could remove the card by accident. This worked on my test machine (an Eee 901 somewhere in Ottawa, since I don't actually have one myself...) but was reported to work less well on a 700. Since the 700 didn't claim to have any support for power control, the code was forced to wait a second on every operation to see whether the link powered up or not. This resulted in long pauses during boot and suspend/resume operations.

The final issue that convinced me that this was the wrong approach was reading a document on Microsoft's site on how PCIe hotplugging is implemented in Windows. It turns out that XP doesn't support native PCIe hotplugging at all - that feature was added in Vista. Both the Eee and the Aspire One are available with XP, but things work there. So PCIe native hotplugging was clearly not the right answer. Time to look further.

Armed with a disassembly of the Aspire One's DSDT, I figured out why the ACPI hotplug driver didn't work on it. The first thing the driver does is walk the list of ACPI devices, looking for any that are removable. That was being implemented by looking for an _EJ0 method. _EJ0 indicates that the device can be ejected under the control of the OS. The Aspire One doesn't have an _EJ0 method on its SD readers. However, it did have an _RMV method. This can be used to indicate that a device is removable but not ejectable - that is, the device can be removed (by physically pulling it out or by the hardware taking it away itself), but there's no standard way to ask the OS to logically disconnect it. A quick patch to acpiphp later and the Aspire One now worked without any forcing or spec contravention. This also has the nice side effect of making ExpressCard hotplug work on a bunch of machines where it otherwise wouldn't.
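The acpiphp change amounts to loosening the removability test. In simplified form (a model only - the real code evaluates these methods through the ACPI interpreter):

```python
# Simplified model of the loosened acpiphp check: a slot is hotpluggable
# if the OS can eject it (_EJ0 present) or if the firmware declares it
# removable (_RMV evaluating to 1).

def slot_is_removable(methods):
    if "_EJ0" in methods:
        return True                    # OS-initiated eject supported
    return methods.get("_RMV") == 1    # removable, but not OS-ejectable

print(slot_is_removable({"_EJ0": object()}))  # ExpressCard-style slot
print(slot_is_removable({"_RMV": 1}))         # the Aspire One's SD reader
print(slot_is_removable({}))                  # ordinary fixed device
```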

But back to the Eee. acpiphp still wasn't binding, and a closer examination revealed why. There's nothing to indicate that the Eee's ports are hotpluggable, and there's no topological data in the ACPI tables that ties the wifi function to the PCIe root bridges. However, the Eee firmware was sending an ACPI notification on wifi hotplug. But it was only sending this to the PCIe root bridges, and there's no way to then tell which device had potentially appeared or vanished.

In the end, I gave up on trying to solve this generically. Instead I've got a patch that implements the hotplugging entirely in eeepc-laptop. In an ideal world nobody else will have implemented this in the same way as Asus and we can all be happy.

Syndicated 2008-11-15 18:01:04 from Matthew Garrett


As a brief introduction to this - I first read through the Android code when interviewing with Google for an opening with the Android kernel team. I didn't get the job, and while I don't think anything that follows is a result of residual bitterness, you might want to take some of it with a pinch of salt.

Anyway. Android is a Linux kernel with a heavily customised user layer that bears little resemblance to any traditional Linux system. I write in C because using pointer arithmetic lets people know that you're virile so this Java-like thing is of little interest to me and I'm going to ignore it and just look at the kernel, because after all that's what I'm paid to be good at.

The short summary of the rest of this post is that I'm not planning on giving up my iphone yet.

The Android kernel source lives in a git tree here. It can pretty much be logically split into two parts - the port to Qualcomm's MSM7xxx series chips and the special Android customisations. A bunch of the MSM7xxx port code has been merged through the ARM tree and is now upstream, and Brian Swetland seems to have been fairly active in looking after that. Full marks to Google there. This code is handy outside the Android world and benefits anyone wanting to run Linux on similar devices.

The Android-specific code is more... interesting.

As I mentioned, the Android application layer isn't Unix in any real sense. The kernel reflects this. It ranges from pragmatic (if hacky) approaches like encoding security policy and capabilities in group IDs that are hardcoded into the kernel through to implementing an in-kernel IPC mechanism (apparently related to the OpenBinder implementation from Palm, but judging by the copyrights a complete rewrite). To an extent, I'm fine with this. Something like Binder is pretty clearly not going upstream, so the fact that it engages in bizarre design decisions like sticking chunks of its interface in /proc is pretty irrelevant. What's more interesting is the code that should be generalisable.

I work on power management, so I'm always interested in what kind of power management functionality and interfaces people want. Plumbers included a nice discussion with someone from an embedded company whose name I can't remember, culminating in us deciding that the existing cpufreq interface did what they wanted and so no new interfaces needed to be defined. Google was going to be an interesting case of a large company hiring people both from the embedded world and also the existing Linux development community and then producing an embedded device that was intended to compete with the very best existing platforms. I had high hopes that this combination of factors would result in the Linux community as a whole having a better idea what the constraints and requirements for high-quality power management in the embedded world were, rather than us ending up with another pile of vendor code sitting on an FTP site somewhere in Taiwan that implements its power management by passing tokenised dead mice through a wormhole.

To a certain extent, my hopes were fulfilled. We got a git server in California.

Android contains something called the android_power driver (ignore the references to CSMI, which is a piece of OMAP-specific hardware for communicating with the baseband - I'm not sure what they're doing in there). As far as I can tell, this is an interface that handles the device being locked and unlocked, and associated powering down of certain bits of hardware. Except it's shit. Drivers register handlers along with the level at which they wish to be suspended. There's no direct concept of inter-device dependencies, so you end up with stuff like:

android_early_suspend_t early_suspend;
android_early_suspend_t slightly_earlier_suspend;
to deal with the fact that the MSM framebuffer driver depends on the MDDI driver having brought the link back up, and so needs its suspend method called before the MDDI driver's but its resume method called after it. The only way to handle that is to register two methods - the "slightly earlier" one, which has a suspend method but no resume, and the "early" one, which has a resume method but no suspend.
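A toy model of that ordering problem (names and levels are made up; in this model resume callbacks run in the same level order as suspend callbacks, which is what forces the double registration):

```python
# Toy model of android_power's level-ordered callbacks - illustrative only.

class Handler:
    def __init__(self, name, level, suspend=False, resume=False):
        self.name, self.level = name, level
        self.suspend, self.resume = suspend, resume

def run(handlers, phase, log):
    """Invoke each handler that implements the given phase, in level order."""
    for h in sorted(handlers, key=lambda h: h.level):
        if getattr(h, phase):
            log.append("%s:%s" % (phase, h.name))

handlers = [
    Handler("fb", 40, suspend=True),                 # "slightly earlier": suspend only
    Handler("mddi", 50, suspend=True, resume=True),
    Handler("fb", 60, resume=True),                  # "early": resume only
]
log = []
run(handlers, "suspend", log)
run(handlers, "resume", log)
print(log)   # fb suspends before mddi, but resumes after it
```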

Of course, this also means that all of your device runtime power management policy ends up in the kernel. Userspace indicates what state it wants to go to and the kernel decides what's going to get powered down. This kind of coarse grained approach means that as your hardware setup becomes more complex you hit combinatorial explosion. Expressing all the useful combinations of hardware state simply becomes impractical if all you're exposing is a single variable. What would be more useful is the ability for userland to interact with individual pieces of hardware.

The amusing thing is that in many cases Linux already has this. Take a look at the backlight and LCD class drivers, for instance. They provide a trivial mechanism for userspace to indicate its desires and then modify the device power state. It's true that there are other pieces of hardware that don't currently have interfaces to provide this kind of information. And that's where cooperation with the existing community comes in. We've already successfully fleshed out interfaces for runtime power management for several hardware classes, with the main thing blocking us being a lack of awareness of what the use cases for the remaining classes are. But linux-pm has seen nobody from the Android team, and so we end up with a lump of code solving a problem that shouldn't exist.
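The backlight class really is that trivial to drive - two sysfs files per device. A sketch, pointed at a scratch directory standing in for /sys/class/backlight/<device>/ so it can run unprivileged:

```python
import os
import tempfile

# The sysfs backlight class exposes max_brightness (read-only) and
# brightness (read-write) per device; policy stays in userspace.

def set_brightness(backlight_dir, value):
    """Clamp a requested level to the device maximum and write it out."""
    with open(os.path.join(backlight_dir, "max_brightness")) as f:
        maximum = int(f.read())
    value = max(0, min(value, maximum))
    with open(os.path.join(backlight_dir, "brightness"), "w") as f:
        f.write(str(value))
    return value

# Demo against a throwaway directory rather than real sysfs.
demo = tempfile.mkdtemp()
with open(os.path.join(demo, "max_brightness"), "w") as f:
    f.write("7")
result = set_brightness(demo, 99)   # request above maximum gets clamped
print(result)
```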

When Robert Love gave his presentation on Android at Lugradio Live in SF back in April, he talked about how one of the reasons that Google weren't releasing the source for their userland stack until they shipped phones was to prevent it being seen as throwing unfinished code over the wall and then ignoring the community. It's unfortunate that this is almost exactly what's happened with their kernel. Right now the fact that Android is based on Linux is doing almost nothing to benefit the larger Linux community. What could have been a valuable opportunity for us to gain understanding of an interesting problem has instead ended up as yet another embedded vendor kernel, despite all the assurances I got from various people at Google.


Syndicated 2008-11-10 12:38:19 from Matthew Garrett


First, let me make one thing clear. This isn't constructive criticism. This is just criticism. It's directed at software that's so wrong-headed that there's no way to make it significantly better, and everyone involved would be much better spending their time doing something else instead of trying to fix any of what I'm about to describe. It's not worth it. Sit in a park or something instead. Meet new and interesting people. Take up a hobby that doesn't involve writing shell scripts for Linux. You'll be happier. I'll be happier. Everyone wins.

Anyway. I wrote about Automatix some time ago. It died and the world became a better place. More recently it's been resurrected as something called Ultamatix. In summary, don't bother. It's crap. And dangerous. But mostly crap. Again, I'm going to utterly ignore the UI code and just concentrate on what it runs.

  • function cleanup {
    echo "Cleaning up..."
    sudo apt-get autoremove --assume-yes --force-yes
    In other words, "Remove a bunch of packages that might have nothing to do with anything Ultamatix has installed, and don't ask the user first. Oh, and assume yes when asked whether to do anything potentially damaging". This gets called 103 times in various bits of Ultamatix.

  • Oh, notice the sudo in there? Ultamatix is running as root already. Despite this, there are 429 separate calls to sudo.
  • #Test O/S 64 or 32 bit...
    architecture=`uname -m`
    targetarch="x86" #Set 64-bit machines to download 32-bit if no options are set
    if [ "$architecture" != "x86_64" ] && [ "$architecture" != "ia64" ]; then
    It turns out that ia64 is not especially good at running x86_64 binaries. Never mind, eh?
  • rm -rf $AXHOME/.gstreamer-0.10
    sudo gst-inspect
    Which translates as "Delete any self-installed plugins, run gst-inspect as root in an attempt to regenerate the plugin database, really run gst-inspect as root in an attempt to regenerate the plugin database". The flaws in this are left as an exercise for the reader.
  • sudo apt-get --assume-yes --force-yes remove --purge
    Used 111 times. Will remove the packages it installed, but also any other packages the user has installed that happen to depend on them. Without asking.
  • sudo cp /etc/apt/sources.list /etc/apt/sources.bak
    sudo echo "deb http://ppa.launchpad.net/project-neon/ubuntu hardy main" >> /etc/apt/sources.list
    sudo apt-get update
    if !    sudo apt-get install --assume-yes --force-yes amarok-nightly amarok-nightly-tools amarok-nightly-taglib
            AX_fatal "An apt-based error occurred and installation was unsuccessful";
    echo "Restoring sources."
    sudo cp /etc/apt/sources.bak /etc/apt/sources.list
    sudo apt-get update
    The good news is that it backs up your sources.list before breaking things. The bad news is that it's still utterly horrifying.
  • #since we have root we need to discover normal username so we can create the shortcut & set proper permissions
    NU=$(cat /etc/passwd | grep 1000 | cut -d: -f1)
    sudo chown $NU:$NU "legends_linux-"
    sudo chmod +x legends_linux-
    sudo dpkg -i legends_linux-
    List of fail:
    1. Assuming that the user has uid 1000
    2. Chowning a deb to the user for no obvious reason (hint: a user can delete root owned files that are in the user's home directory)
    3. Making a deb executable for no reason whatsoever
    4. Assuming that user information will be in /etc/passwd
    5. Not just, say, passing the user's name to the application IN THE FIRST PLACE
  • sudo apt-get --assume-yes --force-yes install f-spot dvgrab kino devede gtkpod-aac ipod gnupod-tools libgpod-common
    libipod-cil libipoddevice0 libipodui-cil libhfsp0 hfsplus hfsutils libipod0
    If only we had some way of saying that libraries used by programs should automatically be installed when a program is. Wouldn't that be great?
  • echo "Adding mediabuntu repository"
    sudo cp /etc/apt/sources.list /etc/apt/sources.bak
    sudo wget http://www.medibuntu.org/sources.list.d/hardy.list -O /etc/apt/sources.list.d/medibuntu.list
    echo "Restoring sources."
    sudo cp /etc/apt/sources.bak /etc/apt/sources.list
    Yeah, that'll help.

  • The Swiftweasel install that checks your CPU type and then has some insane number of cut and paste code chunks that differ only by the filename of the tarball it grabs. Rather than, say, using a variable and writing the code once.

  • The cutting and pasting of the same code in order to install swiftdove.

  • Code that installs packages differently depending on whether they happened to be in your home directory to start with or whether it had to download them for you
  • if !    DEBIAN_FRONTEND=kde sudo apt-get --assume-yes --force-yesinstall virtualbox
    No, I didn't remove any spaces from that.
  • #create directory incase they installed it elsewhere, no sense in scraping all thier games
    sudo mkdir /usr/local/games/WoP/ 2>/dev/null
    sudo rm -R /usr/local/games/WoP/ 2>/dev/null
    What, create a directory and then immediately delete it? How is this useful in any way whatsoever?

There's almost certainly more. I got bored. The worrying thing about this is that the Ultamatix author read my criticisms of Automatix and appears to have attempted to fix all of them. The problem with this is that there's clearly a complete lack of understanding of the fundamental problem in several cases. For example, one of my criticisms of Automatix:

sudo sed -i "s/^vboxusers\(.*\):$/vboxusers\1:$AXUSER/" /etc/group

- assumes that the system isn't using some sort of user directory service.

and the Ultamatix response:

Fixed...Got rid of Virtualbox

Except exactly the same problem is present at other points in Ultamatix, as noted above. Taking a bug list and slavishly fixing or deleting all the bugs isn't helpful if you then proceed to add the same bug back in 24 other places. In that respect, it's even worse than Automatix - the author's managed to produce a huge steaming pile of shite despite having been told how to avoid doing so beforehand. He may be no newbie to programming, but if so, it's a perfect example of how experience doesn't imply competence.

Don't install this package. Don't let anyone else install this package. If you see anyone advocating the installation of this package, call them a fool. There's absolutely no excuse whatsoever for the existence of this kind of crap.

Minor update:
The above was looking at 1.8.0-4. It turns out that there's a 1.8.0-5 that's not linked off the website. There's no substantive difference, but some of the numbers may be slightly different.

Syndicated 2008-11-02 12:16:11 from Matthew Garrett

Keyboard handling

The Linux desktop currently receives (certain) key events in two ways. Firstly, the currently active session will receive keycodes via X. Secondly, a subset of input events will be picked up by hal and sent via dbus. This information is available to all sessions. Which method you use to obtain input will depend on what you want to do:

  • If you want to receive an event like lid close even when you are not the active session, use dbus
  • If you only want to receive an event when you are the active session (this is the common case), just use X events

Syndicated 2008-10-23 14:16:46 from Matthew Garrett


  • I'll be speaking at the UKUUG Linux conference in Manchester this November.
  • The ACM have chosen my article on power management from Queue last year as a shining example of such things, and republished it in Communications where you may now peruse it at your leisure. Fanmail may be sent to the usual addresses.
  • I'll be in Boston from the 7th to 11th of December, and New York from the 11th to 15th. I will be endeavouring not to break any bones in the process. Might actually ensure I have travel insurance this time.
  • I'll be presenting at LCA next January. Current plans involve spending a week in Melbourne afterwards and a few days in San Francisco on the way back.

Things I want to do:
  • Visit Iceland. It sounds like it might be relatively cheap soon.
  • Make this I²C code work.
  • Get dynamic power state switching on Atom-based Radeons working. This probably involves actually plugging the card in.

Syndicated 2008-10-07 01:44:22 from Matthew Garrett

Russell writes about the iphone. I think he's missing a few things.

The open nature of the PC wasn't inherently what brought it greater success. The open nature of the PC meant that it could spawn an ecosystem of third party hardware vendors, sure. It also meant that it could be cheaply cloned by other manufacturers, ensuring competition that drove down the price of hardware. The net result? x86 is ubiquitous, sufficiently so that even Apple use a basically standard[1] x86 platform these days. Low prices and the wide availability of software that people wanted to run bought the PC the marketplace, with Microsoft being the real winners. Apple hardware remained more expensive for years, and the compelling MacOS software was mostly limited to areas like DTP. Nobody else had any incentive to buy a Mac.

Now, let's look at the phone market. Third party hardware vendors? No real distinction between the iphone and anything else. Sure, anything remotely clever has to plug into the dock port, but developing something to work with that also gets you into the ludicrously huge ipod market. Other phone accessories are either batteries, chargers or headphones. That's really not going to be what determines market success.

Competitive cheapness? When you have a multivendor OS like Android or Windows Mobile, you might expect there to be more opportunity to compete to undercut each other, offering equivalent platforms for less cost. But that's missing something. In the same way that the home computer market has basically consolidated towards PCs, the phone market has already consolidated. Your smartphone has an ARM in it, probably along with an off the shelf GSM chip and some 3D core (generally something from Imagination, though in Android's case Qualcomm seem to have come up with their own core - I haven't been able to find out if it's derived from something else). There's no realistic way to make a phone with equivalent hardware functionality and quality to the iphone and sell it for significantly less money. And if you figure out how to, Apple get to take advantage of the same price reductions in their next generation hardware. And, being Apple, they'll probably find some compelling wonderful design feature that costs them nothing extra but makes you want it more anyway. So hardware competition probably isn't going to be what determines market success.

Which leaves two things - advertising and applications. Apple are good at marketing. This is unfortunate, because I'd really rather live in a world where everyone running MacOS was running Linux instead, but we seem to suck in comparison. The good news is that Microsoft also seem to, so maybe we'll have our act together some time between now and Apple crushing us to death. So, assuming current trends continue, Apple's marketing probably isn't going to kill the iphone. Which leaves one thing: applications.

The obvious argument against the iphone's success is that, as a closed software distribution platform, it's less attractive to developers. I don't think that's true. If we look at the console market, the gp2x was hardly a PSP killer. Or a DS killer. You could possibly argue it was a Gizmondo killer, but only if you ignore the Finnish mafia. Being an open platform doesn't immediately result in you killing closed platforms. You need developers, and you need applications. Otherwise nobody's going to buy your hardware, even if it costs $10 less than an iphone and has a few extra bits of plastic. What attracts developers? An attractive development environment and a revenue stream. Android has one real thing going for it here - it's not tied to Objective C, and so there's probably a larger number of potential developers. But let's be realistic. If you're a competent developer, you can move from C++ or Java to Objective C without too much effort. And if you're an incompetent developer, you're not going to be deciding the future of a platform.

Apple have made it easy for people to write applications that share the iphone's delightful[2] UI. There's almost active encouragement to write beautiful programs that integrate well. Sure, the platform limitations bite you in weird ways (like the no background running thing), but Apple have come up with hacks to smooth most of those over. The iphone is a wonderful device to develop for. Sufficiently delightful that there was a huge developer base even before Apple had released the SDK. What does that tell you? Developers actively want to write for the iphone. In fact, they wanted to even before there was a real revenue model. Mindshare means a lot.

What are we going to see in response from Android? To begin with, uglier applications. I'm sure that'll get better over time, but right now the Android UI just isn't as well put together[3]. It's functional, even attractive. But it's not beautiful. And lowering the bar to developer involvement means the potential for more My First Phone Application. Windows Mobile and Symbian have huge numbers of applications. They're mostly dreadful lashups of functionality you'd never want and a UI that's ugly enough to make you want to stab out your eyes, coupled with a nag screen asking for a $10 donation to carry on using it assuming it hasn't crashed before it got that far. To be fair, a lot of iphone stuff isn't much better. But proportionately? Right now the Apple stuff has it. I never want to see another listing of Symbian freeware.

At the moment, Apple wins at providing compelling applications. They may be a gatekeeper between the developer and the user, but right now that's not causing too many problems. Well. It wasn't. The recent fuss about Apple dropping applications because of perceived competition with their own software is an issue. If a developer is going to spend a significant amount of time and money on an application, they want a reasonable reassurance that they're going to be able to ship it. And, right now, Apple's not giving that. It remains to be seen whether this has long term consequences, but there's some danger of Apple alienating their developer base. If those developers move to another platform, and if they create compelling software, Apple might face some real competition. At the moment? Apple has the hardware, the OS and the applications. They have the potential to take over broad swathes of the market. But they also have the opportunity to throw it away. And that's what's going to decide the success of the iphone - a closed platform is not inherently a problem, but it gives the vendor the option of removing one of their key advantages. If Apple get through this with their developer popularity intact, I don't see the open/closed distinction as having any real-world importance at all.

The relevance to Linux? We're not going to succeed by being philosophically better. We have to be better in the real world as well. Ignoring that in favour of our social benefits doesn't result in us winning.

[1] It's slightly more legacy-free than a "genuine" PC - there's no i8042, and things like the gate A20 hack aren't implemented. But it'll boot DOS (given enough effort), so hell, it's a PC

[2] And yes, I genuinely do think that the iphone's UI is better than anything else on the market. There's no reason someone else, including us, couldn't have got there first. But we didn't, and now everyone gets to play catch up. Shit happens

[3] I have no idea where Apple gets its UI engineers from, but someone needs to find the source and start waving huge piles of money to pick them up first.

Syndicated 2008-09-25 02:48:51 from Matthew Garrett
