Older blog entries for StephanPeijnik (starting at number 29)

What's all the fuzz about canonical-census?

I know I have not updated this blog in quite a long time now, but something caught my attention today: canonical-census.

As slashdot.org reports Canonical begins with tracking their (OEM) installations. Now it's obvious that people are uncomfortable with a program running on their system which phones back to their OS vendor, that's why I have had a quick look at what exactly canonical-census does.

Firstly however, I would like to point out that the report on slashdot.org is very clear about which information is being gathered, being "the number of times this system previously sent to Canonical [...], the Ubuntu distributor channel, the product name as acquired by the system's DMI information, and which Ubuntu release is being used". And it's perfectly correct. After getting the canonical-census Debian source package (using dget -u https://launchpad.net/ubuntu/+archive/partner/+files/canonical-census_0.1.dsc) the source package shows, besides the Debian packaging information, two scripts:

  • census (written in Python) and
  • send-census (a GNU bash script).
Now what do those scripts actually do?

send-census is installed in /etc/cron.daily, which means it will be executed once a day by the system's cron daemon. It's a mere 48 lines long, and its code is quite simple. So everyone with at least some shell scripting experience can easily check what it's doing. Now guess what, it sends exactly the information as reported on slashdot to Canonical. Nothing more and nothing less.

Technically it keeps a plain text file containing a single number as its call-counter, residing in /var/lib/send-install-count/counter and uses an on my Ubuntu Lucid system nonexistent /var/lib/ubuntu_dist_channel file for getting information about the distribution channel.
The above mentioned "system's DMI information" is not the whole bunch of DMI information available, but only the contents of /sys/class/dmi/id/product_name, which strangely enough returns "System Product Name" on my machine. Last but not least it uses lsb-release to get the distribution release (ie. 10.04 for my system).

Now those four pieces of information are sent to http://census.canonical.com/submit via a simple HTTP GET query, using wget. The full URL with all the parameters added is:
http://census.canonical.com/submit?count=count&dcd=dist_channel&product=dmi_product_name&release=ubuntu_release_version

The second script, census, is the part working on Canonical's script. Basically census reads in their Apache's access log file and creates an SQLite database from the contents of the log file. With 391 lines this script is a bit longer, but it does not end up in the Debian package at all.

Personally I do not see how Canonical or one of their partners could possibly do anything harmful with that information. Comparing this to Debian's popcon reveals that Debian is gathering a lot more information.

Now there are two more things one should consider: census is targeted at OEMs, which means its unlikely that it will end up on each and every Ubuntu installation and can be uninstalled by removing the canonical-census package with your favorite package manager.

Finally, think about this for a second: It's a shell script you can always examine. There is no hidden magic and it's a plain HTTP request the script is sending. No evil things happening there.
And now compare that to what other (often proprietary) software vendors do and how much data they submit, possibly even in encrypted form so you do not know for sure what is being sent to them.

Personally I welcome the openness of Canonical with providing their users with the package's code this early and being straight about what information it submits. They could have silently added it to those installations after all...

Happy hacking!

Syndicated 2010-08-10 12:07:00 (Updated 2010-08-10 12:07:35) from sp

Rest in peace Flo: 13.11.1986–16.12.2009

Today is a sad day. Everything feels like I am having a bad nightmare. That's because today I learnt from the too early death of my friend Florian Hufksy.

I am sitting here and do not really know what to write. I keep on thinking about the great times we spent together. The time we started programming when we were twelve. The time we spent learning BASIC. All the times you knew more than me and could teach me a thing or two. I remember our geek talks. How we would discuss latest games. How we lost contact and how we met again. I am thinking about how sorry I am for not having met you often enough. I keep on trying to understand what drove you that far. How you could just end it all. More and more memories come to my mind, like the moment when you showed me one of your projects, Super Mario War. The moments we had playing video games together. All those moments, all that time, I miss you my friend. You were a genius, always a step ahead, not only of me, but seemingly the whole world. I can't stop thinking about your brilliant ideas and how you always finished your projects. You were a real hacker, a real genius, a person trying to make the world a better place, a person who will be missed, not only by me.

You were a genius and I always respected you, not only as a hacker, but as a beloved friend. Why did we not spend more time together? Why did you have to go? Why do I have to write this now, sitting here in my chair with tears in my eyes? And all those memories come up again and again. There is so much more that comes to my mind, but I can't keep on writing, it just hurts too much.

The world is a sad place today. I am sad. I am mourning the too early death of my beloved friend, Florian. You will always have a special place in my heart.

Syndicated 2009-12-19 23:15:00 (Updated 2009-12-19 23:15:48) from sp

ujail: use cases, FAQs, part 1 & proof of concept, part 2

As I ran out of time whilst writing the "introducing ujail" post on monday I would like to further elaborate on the idea, giving you some examples of possible use cases and then having a look at FAQs regarding ujail. Additionally I have created a second proof of concept that should be a lot faster, see below for more details.

Use cases of ujail

Monday's post was rather technical, so let's have a look at possible use cases today.

The main reason for both having the idea of ujail and starting working on it is my web server. I am running quite a few (S)CGI scripts there and, even though running them as different users, on a per-vhost basis, I have the impression of the whole thing being a bit insecure.

Okay, PHP does provide its famous open_basedir feature, but I am also running some Python applications which I simply cannot restrict easily. My first ideas involved adding something similar to open_basedir to Python, followed by the idea of replacing some C library functions, like fopen and friends on startup time.

Whilst the adding open_basedir to Python would have involved changing a lot of Python's internals I soon discarded the library patching idea as those could be worked around by injected code directly invoking syscalls. It didn't take long for me to notice that I have to dig deeper. The idea of ujail was born and after coming up with the proof of concept this seems to be a viable solution.

Now ujail is not only about protecting a web server from its web applications, but could do a lot more, for example:

  • Creating a sandbox for untrusted code (socket&file i/o emulation)
  • Implementing some sort of personal firewall (socket-call only emulation)
  • Testing applications that perform low-level system operations (read: package managers and friends, filesystem emulation)
 I am sure you can come up with even more use-cases. What should be noted is that emulating a system call does not mean that one necessarily needs to emulate the whole filesystem. What can be done, for example, is patching through access to common files (libraries, executables, etc.) whilst maintaining a virtual filesystem for data that will eventually be modified. A copy-on-write approach is possible too, for example. There are multiple methods with which the multiple filesystem could be implemented, the most common would probably be using a state directory.

FAQs

There have been some questions about ujail in comments to my first post which I would like to answer. Also, I have been thinking about things that are different about ujail compared to other virtualization techniques. Feel free to add additional questions either in a comment or drop me an email: debian at sp dot or dot at.

  1. Could you change the license of ujail to ... ?

    Not likely to happen. The proof of concept's license is GPLv3 and the actual code's license will be too. However, ujail is a userspace application that does not need any modifications to the kernel so there should be no problems with porting ujail from GNU/Linux to any other system.

  2. Does ujail work on operating systems other than GNU/Linux?

    Not yet. If it's technically possible to implement the technique on other operating systems I would be happy to accept patches.

  3. Do I need to patch my kernel for ujail to work?

    No, ujail is running in userspace. The only thing it needs is Linux with support for PTRACE_SYSEMU.

  4. How is this approach different from using LD_PRELOAD?

    With LD_PRELOAD one can replace library functions, but malicious code could still directly invoke syscalls, working around this protection completely. Also, statically linked binaries cannot be restricted with LD_PRELOAD.

  5. How is this approach different from user-mode-linux?

    User-mode-linux (UML) works by emulating a full kernel in userspace and allows you to virtualize a whole Linux instance (including a new init process, etc). ujail is about providing a way of restricting a single process (and its childs) inside a running system in terms of access to syscalls and the partial emulation of those.

  6. How is this approach different from linux-vserver?

    Linux-vserver is a kernel patch and runs in kernel space, as opposed to ujail, which works in userspace.
    Also, linux-vserver works similarly to user-mode-linux, providing a fully virtualized Linux instance.

  7. Does the account running ujail need any special privileges?

    No, the only restrictions that apply are those of ptrace.

  8. Where is the code?

    Right now ujail is in a planning phase, and only the proof of concept code has been written and published. The actual ujail code is yet to be written and the code will be hosted on launchpad.net.

Proof of concept, part 2

An anonymous person (who were you stranger?) added a comment to my first post, suggesting "Also, why patch the process rather than just modifying its state and trapping into the kernel?". I have had a look at this approach earlier, but it didn't work out. However, I decided to give it yet another try and created a second proof of concept. That code does not require patching any code, but only modifies the instruction pointer (eip) and the first register (eax). This should be a lot faster than patching the code.

Technically the new main loop works by calling PTRACE_SYSEMU and waiting for a notification. It then saves the instruction pointer and switches to PTRACE_SYSCALL. As before it waits for the emulated syscall to exit and at this point sets eax from orig_eax and decreases the value of the instruction pointer by the size of the "int $0x80" instruction. Another call to PTRACE_SYSCALL resumes the process. The next event is the process actually entering the real syscall and yet another one leaving the syscall again. These are resumed by PTRACE_SYSCALL and PTRACE_SYSEMU respectively. So, comparing this with the first approach we are only modifying two registers now, instead of writing to the TEXT area of the running process.

Thanks should go to the anonymous commenter for making me give this approach another try.

Questions? Criticism? More ideas? Want to contribute?

Coming to an end I would yet again like to let you know that I am  open for questions, criticism, more ideas and contributions in general. So if you are interested in this topic come join the discussion by either dropping me an email, writing a comment to this post or replying to this post on your own blog.

Syndicated 2009-12-09 10:16:00 (Updated 2009-12-09 10:16:46) from sp

Introducing ujail & proof of concept

Lately I have been thinking about methods to provide a stripped down, secured environment for running untrusted code on GNU/Linux. With this post I would like to present you with the first results of my research.

ujail - brief introduction

I have chosen ujail as the name for the technique I am proposing. ujail stands for micro jail in userspace and, in itself, describes the concept briefly. The main idea is to have a userspace process monitor system calls of one of its childs and emulate some calls, if needed. This is done using ptrace and namely both PTRACE_SYSEMU and PTRACE_SYSCALL.
The ujail process should not be able to monitor syscalls, like strace does, but also intercept and emulate them.

This sounds a lot like user mode linux (uml), but the method is different. Whilst uml comes with a complete kernel, emulates all system calls and this way provides a virtualized system, ujail is intended to only emulate some systemcalls, without emulating the kernel.

Revisiting PTRACE_SYSCALL & PTRACE_SYSEMU

To better explain how the ujail technique works I would like to have a quick look at PTRACE_SYSCALL and PTRACE_SYSEMU again.

PTRACE_SYSCALL allows a userspace process to be notified whenever a traced process enters or leaves a system call. This means that two notifications are normally sent: one before system call entry and one afterwards. Even though one is able to change the parameters of system calls this method does not allow system calls to be fully emulated (think virtual filesystem here).

PTRACE_SYSEMU on the other hand provides one notification on syscall entry and expects the receiver of the notification to emulate the syscall. This method alone sounds great, but this also means that memory allocation needs to be emulated too, which is quite complex in userspace.

A hybrid of PTRACE_SYSCALL & PTRACE_SYSEMU

Now on to the concept behind ujail. The method I am describing works by calling PTRACE_SYSEMU for a specific process and this way taking over emulation of all system calls. However, some system calls are complex to emulate in userspace, and so a hybrid of both PTRACE_SYSEMU and PTRACE_SYSCALL is needed. In short this works by checking whether the syscall needs to be emulated when the PTRACE_SYSEMU event is received.
Now one way is emulating the syscall, filling the processes' registers and resuming execution of the process. This is simple and straight-forward.

The second way is forwarding the system call to the kernel. The problem here is that calling the syscall in the monitoring process will make the new resources available to that very process, and not the process to be jailed. This is where the hybrid method kicks in.

The proof of concept code creates a backup of the next instruction to be executed along with a copy of the instruction pointer at this point and patches it with the opcodes for "int $0x80", causing the syscall to be made again. After that it resumes execution with PTRACE_SYSCALL and waits again. The first event to be received now is the program leaving the emulated system call, which can be ignored. Resuming yet again will give use two PTRACE_SYSCALL events, one for syscall entry and one for syscall exit.

The first event is not really interesting, but at the second event the opcode backup is restored and the eip set from the saved value. Now the kernel has handled the syscall and the result is ready for the child process. A final call of PTRACE_SYSEMU resumes execution of the child and waits for the next syscall.

Proof of concept

The proof of concept code can be downloaded from its bazaar branch at launchpad.net. It is intended to be used on i386 systems only and works with simple programs, but is known not to work with anything using fork, vfork and most likely will not work for binaries using threading.

Finally, I would like to thank Pradeep Padala for his "Playing with ptrace" articles [0][1], which were fun to read and worked as a great introduction of ptrace for me.

Now there is only one thing left to say: if you are interested in this method, see loopholes or problems or want to contribute, please go ahead and contact me:

debian at sp dot or dot at

Syndicated 2009-12-07 16:49:00 (Updated 2009-12-09 10:17:25) from sp

How to copy partitions under GNU/Linux the easy way

After getting a new disk for my Popcorn Hour A-110 device I had to copy all partitions from the old disk onto the new one so I do not have to reinstall some applications and reconfigure everything.

After searching the web and trying to find a free alternative to Norton Ghost and Acronis True Image, preferably not using a boot disk on its own (I did not want to backup my workstation after all, just a simple partition to partition copy between two SATA disks) I gave up and decided to do the copying manually.

So I fired up gparted to do the partitioning, did a right click and... I noticed that gparted supports copy/paste. Being curious about what this could potentially do I gave it a try. I marked partition one on the old disk, did a copy, went to the new disk and clicked on paste - and guess what, gparted did what I was looking for.

Putting a long story short: you can copy whole partitions using gparted's copy/paste mechanism and even resize them whilst doing so. I am somehow ashamed I did not notice this feature earlier, having been a gparted user for a few years now and I can imagine I am not the only one who missed that.

Syndicated 2009-12-01 17:22:00 (Updated 2009-12-01 17:23:47) from sp

kvm, qemu and the magic of ubuntu-vm-builder

As I noted two days ago I was unable to build Android on Ubuntu 9.10 x86-64 and thus needed to set up a virtual machine.

At first I went for my preferred virtualization solution, VirtualBox and had to notice that even though I assigned all 4 processor cores of my workstation (along with 2GiB of memory) to the virtual machine building was painfully slow. I immediately ditched the idea of using  VirtualBox again and decided to give something new to me a try: the combination of kvm and qemu.

Having Intel VT-x support built into my workstation's processor I thought that this combination should give better performance, and I wasn't disappointed. To be honest, I am astonished on how fast the beast is now. Disk speed still seems to be not as fast as running things natively, but there must be a downside somewhere. :-)

After a bit of googling I also found that ubuntu-vm-builder exists, which simplifies virtual system creation tremendously.

My Android working tree is being synchronized right now, which means that I should be able to start building in a few minutes time. I hope the virtual machine stays as fast as it is right now during the build and I hope everything goes well.

Syndicated 2009-11-11 08:25:00 (Updated 2009-11-11 08:28:55) from sp

An update on the proprietary Maemo SDK installer

Yesterday I wrote about my dissatisfaction with the current state of the Android 2.0 code tree and how a proprietary install script for Maemo scared me off.

As suggested in one of the comments to my post I filed a bug report against Maemo, bug 6087.

Besides getting quite a few replies to the bug report within a matter of hours Carsten Munk pointed me at Maemo SDK+, which has less restrictive licensing.

Another comment, by Marius Gedminas (thanks!) pointed me at Mer,

a new operating system for small, mobile touch-screen devices.

It is Linux based and layers the best open-source elements of Nokia's Maemo platform over a modern Ubuntu base.

The goals of Mer include:

  • Integrate the best solutions for a wide variety of small form-factor devices
  • Encourage wider access to device capabilities through the Vendor Social Contract
  • Demonstrably provide an easy route to market for vendors
  • Dramatically reduce costs to vendors of supporting EOL hardware
  • Focus, harness and support community contributions to the platform
  • Encourage and ease migration of existing applications
  • Support experimentation, innovation and development

Syndicated 2009-11-09 21:00:00 (Updated 2009-11-09 21:09:58) from sp

My Android repositories

As I wrote in my last post I noticed a few problems with Android's roaming detection code and decided to try fixing it myself.

So, I am basing my work on CyanogenMod, which I am also using on my Android device. My repositories are hosted at github.com/speijnik and you can fetch (nearly) everything you need for building by using repo. See the README file in my android repository over at github for details.

For now only the simplification of the roaming detection code has made it into the repository, but be aware that even though I have published the code I still have neither built nor tried it, as I do not have a working build environment set up yet.

Oh, about the working build environment: there seem to be problems with either the webkit code in the Android repositories (unlikely) or with building that code on Ubuntu 9.10 x86-64 (more likely). Right now I am downloading Ubuntu 8.04 LTS i386 for use in a virtual machine. I will let you know whether that fixes my problems or not.

Syndicated 2009-11-09 20:58:00 (Updated 2009-11-09 20:59:29) from sp

Android's roaming detection & its implementation

I know I wrote about Android already today, but there is another thing that concerns me right now. I am owner of an Android-based phone (an HTC Dream) and recently switched my mobile network provider. The problem is that my new provider is a virtual provider and as such there is no real network of that provider. Now Android has a feature to turn off broadband connections when in roaming mode, which itself is a great idea and can save you from paying quite a lot of money when the phone connects to 3G abroad, but this feature also turns off broadband connections when roaming locally. All this is being discussed in bug report #3499.

After noticing this problem I became curious on how Android detects that it is roaming and I found the GsmServiceStateTracker.isRoamingBetweenOperators method to be responsible for that magic, but soon noticed that the method is not only inefficient, but also doesn't work as intended. This is hardly related to the bug mentioned above, but let's have a look at the code in question:

/**
* Set roaming state when gsmRoaming is true and, if operator mcc is the
* same as sim mcc, ons is different from spn
* @param gsmRoaming TS 27.007 7.2 CREG registered roaming
* @param s ServiceState hold current ons
* @return true for roaming state set
*/
private
boolean isRoamingBetweenOperators(boolean gsmRoaming, ServiceState s) {
String spn = SystemProperties.get(PROPERTY_ICC_OPERATOR_ALPHA, "empty");

String onsl = s.getOperatorAlphaLong();
String onss = s.getOperatorAlphaShort();

boolean equalsOnsl = onsl != null && spn.equals(onsl);
boolean equalsOnss = onss != null && spn.equals(onss);

String simNumeric = SystemProperties.get(PROPERTY_ICC_OPERATOR_NUMERIC, "");
String operatorNumeric = s.getOperatorNumeric();

boolean equalsMcc = true;
try {
equalsMcc = simNumeric.substring(0, 3).
equals(operatorNumeric.substring(0, 3));
} catch (Exception e){
}

return gsmRoaming && !(equalsMcc && (equalsOnsl || equalsOnss));
}

Okay, let me summarize what this piece of code does wrong, at least from my understanding:

  • It takes both the network operator alphanumeric identifier and alphanumeric long identifier and compares both to the alphanumeric identifier coming from the SIM card, whilst...

  • ... it could simply use the network and SIM card numeric identifiers and compare those, which should be a lot cheaper than comparing those strings

  • Then it takes the first three characters/digits of the numeric identifiers (which indicate the country) and compares those


Now in my case my SIM card doesn't seem to provide the phone with a alphanumeric identifier, so the first two comparisons always fail for obvious reasons and, looking at the inline-if in the last line of that method my phone will always indicate that I am in roaming mode, even when I am not.

The problem is not only the logic which seems to be wrong, but I rather see the inefficient comparisons used there to be a major problem in embedded systems like mobile phones. This is the first piece of Android code I have had a look at, but if all other code is as ugly and inefficient as these few lines Android really needs some major fixes. Related to this I have reported bug #4590 and forked the git repository in question over at github, to fix this method, should be a matter of 5 minutes.

Syndicated 2009-11-08 20:04:00 (Updated 2009-11-09 20:21:28) from sp

Android, Mythbusters and openness

I have been reading a great many posts about Android lately, some consisting of criticism, some of praise and some simply addressing issues in the Android "community". Let's have a look at those.

Matt Porter's Android Mythbusters presentation and Harald Welte's reaction


I haven't seen the presentation live, but I had a look at the slides. Impressing work done by Matt putting all this information together. However, we all knew that Android only (ab-)uses Linux, without making use of the GNU userland for a long time, didn't we?

In his presentation Matt has shown things such as Android's udev "replacement" that uses hardcoded values for device node creation and (on his blog) Harald has then come up with a statement I have found to be very strong:

The presentation shows how Google has simply thrown 5-10 years of Linux userspace evolution into the trashcan and re-implemented it partially for no reason. Things like hard-coded device lists/permissions in object code rather than config files, the lack of support for hot-plugging devices (udev), the lack of kernel headers. A libc that throws away System V IPC that every unix/Linux software developer takes for granted. The lack of complete POSIX threads. I could continue this list, but hey, you should read those slides. now!

Now both of these statements target technical details, but the root of the problem seems to be elsewhere.

Where is my Android 2.0?

Okay, that heading might not be making any sense in the context of this post at a first glance, but let me elaborate on that. Google and the Open Handset Alliance refer to Android as being an "Open Source" operating system, but the project is different from "real" Free Software projects: development takes place in a closed group and the results are shared with the community later on, when they are deemed to be ready.

This means that innovation also takes place behind closed curtains and that the community is not involved in the actual development process at all. Lately we have seen the result of that, as Motorola is bragging about working close with Google on Android 2.0 ("Eclair"), but the AOSP source trees, open for everyone to have a look at, show no signs of version 2.0. In fact no changes that might even remotely suggest the release of a new major version have been made public in the past few weeks. So where is the openess there?
Actually, the Motorola Droid has already shipped with Eclair on 6th, but still, there is no indication that Eclair will be made available to the broader public.

In short Android seems to be developed behind closed curtains, with hardly (read no) community input whatsoever and is sometimes released as Free Software, not what I would describe as an open development process.

The Android Market problem

As we have seen in the past Google is enforcing their copyright on proprietary applications that ship with pretty much every Android device, such as the Android Market. This has become really clear when Steve Kondik received  a cease and desist letter when packing the Google-proprietary applications into his ROMs. Okay, it's Google's right to enforce their copyright and there is nothing wrong with actually doing so, the thing I really have a problem with is something else: the Market is proprietary.

Now what this means should become rather clear. You can have an Android device without Google's proprietary bits, but with default settings you just do not have any way of installing additional software. In my opinion the Market should be freed by Google themselves, or the community has to react and come up with a free replacement to overcome the vendor lock-in. Oh, you might know a replacement called SlideMe (or Mobentoo) already. Well, that bugger is proprietary too, so not a solution at all.

Nokia and Maemo to the rescue

In most discussions about the openness of Android someone throws in Nokia and Maemo, as a solution to the dilemma. Reading all those positive comments I simply had to give it a try, but all my hopes were destroyed within a few minutes.

Let's start with the good news and let alone the reason why my hopes were destroyed for another minute or two. Maemo is based on Debian GNU/Linux and various Free Software components, such as GTK+, gstreamer, esd and friends. Most of the system is Free Software which is a good thing(tm) and reading all of this really got me into Maemo. Okay, some applications seem to be proprietary, but I am sure that could be fixed rather easily, so I could once for all use a truly open phone.

...and then came the SDK installer shell script:
#!/bin/sh
# Copyright (C) 2006-2009 Nokia Corporation
#
# This is proprietary software owned by Nokia Corporation.
#
# Contact: Maemo Integration <integration@maemo.org>
# Version: $Revision: 1110 $

Now there is one question you should ask yourself: Why would someone trying to promote his platform as being open make the *installer* script for its SDK proprietary? Come on, it's an installer script, how much of your secret juice could be in there? What's the problem with people modifying it and working on this installer script in an open development environment?

I had high hopes for Nokia actually doing a bit better than Google, but it seems they've failed to do so. It may be me overreacting, but a proprietary SDK installer shell script scares me enough not to install the SDK and have a look at it for now nor to think about buying a Maemo-based device in the near future. Please Nokia, either get the facts straight or provide us with a free SDK to your free & open platform.

So, in short, Google is bad at working with the community and creating a truly open development process, and Nokia simply fails in terms of not scaring off prospective developers for their open platform with the proprietary SDK installer. Do you have any solutions in terms of an open phone environment, apart from what OpenMoko has come up with?

Syndicated 2009-11-08 04:49:00 (Updated 2009-11-09 20:21:28) from sp

20 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!