Older blog entries for dan (starting at number 147)

Give me back my event loop

My new phone has now been emancipated, thanks in part to Atheer-pronounced-Arthur (or possibly vice versa) at the “Root” internet cafe on Edgware Road. They have an XTC Clip and he was able to do the S-OFF thing for me in about ten minutes and for the princely sum of £10. Recommended. I have been looking at AOSP build instructions, but actually doing the build and flashing the phone with a nice clean 2.3.4 Sense-free system will have to wait until I can devote a few more mental cycles to it.

In between the distractions of Shiny! New! Toy! I have been working on the projectr/thin-prefork pair – I am still reasonably convinced that they should be independent modules, though as I always seem to end up hacking on both at once I worry about the degree of coupling between them – to impose some sense on the interface for extending the thin-prefork server. Which I think is 80% there, but this morning I thought it was 100% there until I started trying to use it for real, so that’s a little bit annoying.

Which brings us to the rant/plea for today, as indicated in the title. Hands off my event loop! I’m sure I’ve already said this in other contexts and with regard to other platforms, but: I am not going to devote my process to calling Oojimaflip.run! when there are other things it should be doing concurrently with watching for Oojimaflips, and I see no reason either to start a new thread (or process) exclusively for your use when you could have just written a method that says whether there are fresh oojimaflips and another to say what they are.

I am prompted to say this by rb-inotify, which is a (quite nicely written) wrapper around some kernel functionality that communicates via a file descriptor. I’d like a wrapper like this to (1) give me the file descriptor so I can call Kernel.select on it, along with all the other files I’m looking at; (2) give me a method which I will call when select says the fd is ready to read, which will read the pending events and (3) digest them into beautiful Ruby-friendly Event objects. What I’ve got is about two out of three (good odds if you’re Meatloaf): there is a public method #to_io whose return value I can plug into select, there are beautiful Ruby-friendly Event objects, but to get those objects, unless I’m overlooking something (and I don’t mean that to sound passive-aggressive), I have to run one cycle of the rb-inotify event loop: call the #process method which calls my callback once per event, which has to find somewhere to store the events it’s passed, and then check the stored events when control eventually unwinds from #process and returns to me.
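As a rough sketch of the three-part interface asked for above (and emphatically not rb-inotify's own API), here is a toy watcher in which a plain pipe stands in for the inotify file descriptor. The names OojimaflipWatcher, Event and read_events are made up for this illustration, and the "events" are just newline-terminated strings.

```ruby
# Hypothetical wrapper: a pipe stands in for the kernel's file descriptor.
Event = Struct.new(:name)

class OojimaflipWatcher
  def initialize(io)
    @io = io
    @buffer = String.new
  end

  # (1) hand over the descriptor so the caller can select on it
  def to_io
    @io
  end

  # (2) read whatever is available without blocking, and
  # (3) digest it into Event objects
  def read_events
    begin
      @buffer << @io.read_nonblock(4096)
    rescue IO::WaitReadable, EOFError
      # nothing to read right now
    end
    events = []
    while (line = @buffer.slice!(/\A.*\n/))
      events << Event.new(line.chomp)
    end
    events
  end
end

reader, writer = IO.pipe
watcher = OojimaflipWatcher.new(reader)
writer.write("created\nmodified\n")

# The caller owns the loop: this fd is just one of many it might be watching
ready, = IO.select([watcher.to_io], nil, nil, 1)
events = ready ? watcher.read_events : []
```

The point of the shape is that nothing here takes over the process: the caller decides when to select, what else to select on, and when to ask for events.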

I’m actually being a bit harsh here, because the event-parsing code is there in the internals and not hard to grab. In the basement lavatory behind the “beware of the leopard” sign, I find a method called read_events, which if you don’t mind calling undocumented code can be used something like this. The preceding call to select would be better replaced by some code to put the file into non-blocking mode, but that’s a refinement that can wait for another time.

I have opened an issue on github saying something similar, which I expect is far more likely to have a useful effect than posting on this obscure blog. But, yeah, I like ranting.

Syndicated 2011-05-20 13:19:57 from diary at Telent Netowrks

Desire S: don't

If I had known a week ago about the lengths to which HTC are now going to prevent people from using their phones, I would have bought some other phone (like maybe the LG Optimus 2X) instead – and if you are the kind of person who prefers to choose your own software rather than let device manufacturers and mobile phone networks do it for you, I would recommend that you don’t buy it either.

That’s right, as far as I can determine it’s not (currently, at least) rootable. Finding this out is made harder than it needs to be because for any conceivable relevant search term, Google seems to prefer the xda-dev forum – by volume, 98% composed of script kiddies saying “OMG LOLZ” – over any results from people who know what they’re talking about. But here’s the summary:

  1. as delivered, there is no root access. Well, so far so perfectly standard
  2. there are a couple of promising-sounding exploits which work on other similar devices. The psneuter exploit – or at least the binary copy of psneuter of unknown provenance that I downloaded from some ad-filled binaries site – doesn’t work, erroring out with failed to set prot mask. GingerBreak, though, will get you a root shell. Or at least it will if you get the original version that runs from the command line and not the APK packaged version that a third party has created for the benefit of xda-dev forum users.
  3. the problem is that GingerBreak works by exploiting bugs in vold and as the side-effect is to render vold unusable, you can’t get access to your sd card after running it. So, you think, “no problem, I’ll remount /system as writable and install su or some setuid backdoor program that will let me back in”. This doesn’t work although it looks like it did right up until you reboot and notice your new binary has disappeared.
  4. (incidentally, if you run GingerBreak again after rebooting the phone and it fails, it’s because you need to remove /data/local/tmp/{sh,boomsh} by hand.)
  5. The explanation for this freakiness is that /system is mounted from an eMMC filesystem which is hardware write-protected in early boot (before the Linux kernel starts up) but Linux doesn’t know this, so the changes you make to it are cached by the kernel but don’t get flushed. There is a kernel module called wpthis designed for the G2/Desire Z which attempts to remove the write-protect flag by power-cycling the eMMC controller, but it appears that HTC have somehow plugged this bug on the Desire S. For completeness’ sake, I should add that every other mounted partition has noexec and/or nosuid settings, so /system is the only possible place to put the backdoor command.
  6. Um.

Avenues to explore right now: (1) the write-protect-at-boot behaviour is apparently governed by a “secu flag” which defaults to S-ON on retail devices but can be toggled to S-OFF using a hardware device called the “XTC Clip”, available in some phone unlocking shops. (2) Perhaps it is possible to become root without calling setuid by installing an APK containing a started-on-boot service and hacking /data/system/packages.xml so that the service launches with uid 0. (3) wait and see if anyone else has any good ideas. (4) study the Carphone Warehouse web site carefully and see if they have an option to return the phone for exchange, bearing in mind that I’ve been using it for seven days. Obviously those last two options are mutually incompatible.

Summary of the summary: HTC, you suck.

Incidentally, if you want to build the wpthis module for yourself there’s not a lot of useful documentation on building Android kernels (or modules for them) for devices: everything refers to a page on sources.google.com which appears to be 404. The short answers: first, the gcc 4.4.0 cross-compiler toolchain is in the NDK; second, the kernel source that corresponds to the on-device binary, at the time I write this, can be had from <http://dl4.htc.com/RomCode/Source_and_Binaries/saga-2.6.35-crc.tar.gz> (linked from <http://developer.htc.com/>); third, <https://github.com/tmzt/g2root-kmod/> doesn’t compile cleanly anyway: you’ll need to scatter #include <linux/slab.h> around a bit and to copy/paste the definition of mmc_delay from linux/drivers/mmc/core/core.h

Incidentally (2): <http://tjworld.net/wiki/Android> has lots of interesting stuff.

At this point, though, my advice remains that you should buy a different phone. Even if this one is rooted eventually (and I certainly hope it will be), HTC have deliberately made it more difficult than it needs to be, and why reward that kind of anti-social behaviour?

Syndicated 2011-05-15 13:10:27 from diary at Telent Netowrks

Testing a monolithic app - how not to

In the process of redesigning the interfaces to thin-prefork, I thought that if it’s going to be a design not a doodle I’d try to do it the TDD way and add some of that rspec goodness.

I’m not so proud of what I ended up with

There are a number of issues with this code that are all kind of overlapped and linked with each other, and this post is, unless it sits as a draft for considerably longer than I intended to spend on it, going to be kind of inchoate because all I really plan to do is list them in the order they occur to me.

  • The first and most obvious hurdle is that once you call #run!, the server process and its kids go off and don’t come back: in real-world use, any interaction you might have with it after that is driven by external events (such as signals). In testing, we have to control the external environment of the server to give it the right stimuli at the right time, then we need some way to look inside it and see how it reacts. So we fork and run it in a child process. (Just to remind you, thin-prefork is a forking server, so we now have a parent and a child and some grandchildren.) This is messy already and leads to heuristics and potential race conditions: for example, there is a sleep 2 after the fork, which we hope is long enough for it to be ready after we fork it, but is sure to fail somewhere and to be annoyingly and unnecessarily long somewhere else especially as the number of tests grows.
  • We make some effort to kill the server off when we’re done, but it’s not robust: if the interpreter dies, for example, we may end up with random grandchild processes lying around and listening to TCP ports, and that means that future runs fail too.
  • Binding a socket to a particular interface is (in Unix-land) pretty portable. Determining what interfaces are available to bind to, less so. I rely on there most likely being a working loopback and hope that there is additionally another interface on which packets to github.com can be routed. I’m sure that’s not always true, but it’ll have to do for now. (Once again I am indebted to coderr’s neat trick for getting the local IP address – and no, gethostbyname(gethostname()) doesn’t work on a mobile or a badly-configured system where the hostname may be an alias for 127.0.0.1 in /etc/hosts.)
  • We need the test stanzas (running in the parent code) somehow to call arbitrary methods on the server object (which exists in the child). I know, we’ll make our helper method start accept a block and install another signal handler in the child which yields to it. Ugh
  • We needed a way to determine whether child processes have run the correct code for the commands we’re testing on them. Best idea I came up with was to have the command implementation and hook code set global variables, then do HTTP requests to the children which serve the value of those global variables. I’m sort of pleased with this. In a way.
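One way to take the guesswork out of the sleep 2 mentioned in the first bullet is to have the child announce its own readiness over a pipe. This is a different technique from the one the tests currently use, sketched here with a do-nothing child standing in for the forked server; none of it is thin-prefork code.

```ruby
# Sketch only: the child is a stand-in for the forked server.
ready_r, ready_w = IO.pipe

pid = fork do
  ready_r.close
  # ... real code would bind sockets and spawn workers here ...
  ready_w.puts "ready"
  ready_w.close
  sleep # pretend to serve until told to stop
end

ready_w.close
# Wait (with a timeout) for the child to say it is up, instead of sleeping blindly
status = IO.select([ready_r], nil, nil, 5) ? ready_r.gets.to_s.chomp : "timeout"

Process.kill("TERM", pid)
Process.wait(pid)
```

The parent proceeds as soon as the child is ready and gives up after five seconds if it never is, which addresses both the "too short somewhere" and the "too long everywhere else" halves of the problem. It does nothing for the orphaned-grandchildren issue.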

Overall I think the process has been useful, but the end result feels brittle, it’s taken nearly as long as the code did to write, and it’s still not giving me the confidence to refactor (or indeed to rewrite) blindly that all the TDD/BDD advocates promote as the raison d’embêter

The brighter news is, perhaps, that I’m a lot more comfortable about the hook/event protocol this time round. There are still bits that need filling in, but have a look at Thin::Prefork::Worker::Lifecycle and module TestKidHooks for the worker lifecycle hooks, and then at the modules with names starting Test... for the nucleus of how to add a custom command.

Syndicated 2011-05-11 15:56:25 from diary at Telent Netowrks

HTC Desire S, unlocked, new, £359

No, I’m not selling it at that price. I just bought it at that price. Amazon and Handtec are both advertising it at about £371 and other well-known online retailers (Expansys, etc.) at even more: what I did was notice that the O2 Store sell it for £349 plus £10 topup, and that Carphone Warehouse have recently introduced a Network Price Promise so even though their advertised price for the same phone and tariff is £399 they will give it to you for the £359 price if you insist. And because it’s CPW and not a tied shop, they will (well, most probably will, and certainly did in my case) give you the unlocked and unbranded handset. In fairness I should say I don’t know whether the handset in the O2 store would have been locked or not.

I’ve only had a few minutes to play with the phone so far so haven’t formed a strong opinion (HTC Sense may have to go …) on it yet, but it’s already very clearly an upgrade from my really rather elderly T-Mobile G1

Syndicated 2011-05-09 14:06:27 from diary at Telent Netowrks

Sinatra and the class/instance distinction

The Sinatra microframework is described as enabling “Classy Web Development”, and it turns out this is more literally true than I previously thought.

The Rack Specification says

A Rack application is an Ruby object (not a class) that responds to call. It takes exactly one argument, the environment and returns an Array of exactly three values: The status, the headers, and the body.

(emphasis mine). When you write a Sinatra app, though, it seems to want a class: whether you call MyApp.run! directly (we assume throughout this post that MyApp is a Sinatra::Base subclass) or use a config.ru or any other way to start the app running, there is a conspicuous lack of MyApp.new anywhere around. Yet the Rack spec says an app is an instance.

At first I thought I was being silly or didn’t understand how Rack works or had in general just misunderstood something, but it turns out not. Some ferreting through Sinatra source code is needed to see how it does this, but the bottom line is that MyApp has a class method MyApp.call which rack invokes, and this delegates to (after first, if necessary, instantiating) a singleton instance of MyApp stored in the prototype field. I am not at all sure why they did this. It may just be a hangover from Sinatra’s heritage and this stuff came along for the ride when Sinatra::Base was factored out of the Sinatra::Application classic app support. Or they may have a perfectly good reason (this is the hypothesis I am leaning towards and I suspect that “Rack middleware pipelines” is that reason). For my purposes currently it’s probably sufficient to know that they do it without needing to know why, and that I should stop trying to write Sinatra::Base subclasses which take extra parameters to new.

:; irb -r sinatra/base
ruby-1.9.2-p0 > class MyApp < Sinatra::Base; end
 => nil
ruby-1.9.2-p0 > MyApp.respond_to?(:call)
 => true 
ruby-1.9.2-p0 > begin; MyApp.call({}); rescue Exception => e ;nil;end
 => nil 
ruby-1.9.2-p0 > MyApp.prototype.class
 => Sinatra::ShowExceptions 

Ta, and with emphasis, da! (The begin/end around MyApp.call is because for the purpose of this example I am too lazy to craft a legitimate rack environment argument and just want to demonstrate that prototype is created. And we should not be surprised that the prototype’s class is not the same class as we created, because there is middleware chained to it. In summary, this example may be more confusing in its incidentals than it is illuminating in its essentials. Oh well)
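For anyone who would rather see the shape of the trick without the middleware getting in the way, here is a stripped-down sketch of the pattern in plain Ruby. It is an illustration only, not Sinatra's actual source; TinyBase is a made-up stand-in for Sinatra::Base.

```ruby
# Simplified illustration of the pattern, not Sinatra's real implementation.
class TinyBase
  class << self
    # Lazily build and cache the single instance that serves requests
    def prototype
      @prototype ||= new
    end

    # Rack calls this on the class; we forward to the cached instance
    def call(env)
      prototype.call(env)
    end
  end

  # The instance is what actually satisfies the Rack contract
  def call(env)
    [200, { "Content-Type" => "text/plain" }, ["hello from #{self.class}"]]
  end
end

class MyApp < TinyBase; end

status, headers, body = MyApp.call({})
```

Because the class itself responds to call, it can be handed to Rack wherever an app object is expected, and since the class calls new with no arguments, a subclass whose initialize demands extra parameters breaks the scheme.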

Syndicated 2011-05-04 12:39:05 from diary at Telent Netowrks

Preforking multi-process Sinatra serving (with Sequel)

Picture the scene. I have a largish Ruby web application (actually, a combination of several apps, all based on Sinatra, sharing a model layer, and tied together with Rack::URLMap), and I want a better way of reloading it on my development laptop when the files comprising it change.

At the same time, I have a largish Ruby web application (etc etc) and I wanted a better way of running several instances of it on the same machine on different ports, because running a one-request-at-a-time web server in production is not especially prudent if you can’t guarantee that (a) it will always generate a response very very quickly, and (b) there is no way that slow clients can stall it. So, I needed something like the thin command, but with more hooks to do stuff at worker startup time that I need to do but won’t bore you with.

And in the what-the-hell-why-not department I see no good reason that I shouldn’t be using the same code in development as is running in production and plenty of good reasons that I should. And a program that basically fork()s three times (for user-specified values of three) can’t be that hard to write, can it?

Version 0 of “thin-prefork” kind of escaped onto github and contains the germ of a good idea plus two big problems and an exceedingly boring name.

What’s good about it? It consists of a parent process and some workers that are started by fork(). There is a protocol for the master to send control messages to the workers over a socket (start, stop, reload, and basically whatever else you decide), and you subclass the Worker to implement these commands. This was found to be necessary, because version -1 used signals between parent and child, and it was found eventually and empirically that EventMachine (or thin, or something else somewhere in the stack) likes to install signal handlers that overwrote the ones I was depending on. And at that point I had two commands which each needed a signal and in accordance with the Zero-One-Infinity Rule I could easily foresee a future in which I would run out of spare Unix signals.
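The parent/worker arrangement can be sketched in a few lines. What follows is a toy under stated assumptions, not the thin-prefork code: SketchWorker, the cmd_ method-naming convention and the line-based wire format are all invented for the example, and the workers serve no HTTP at all.

```ruby
require "socket"

# Hypothetical worker: each command name maps onto a cmd_* method.
class SketchWorker
  def initialize(socket)
    @socket = socket
  end

  # Read one command per line from the master and dispatch it
  def run
    while (line = @socket.gets)
      command = line.chomp
      if respond_to?("cmd_#{command}", true)
        send("cmd_#{command}")
      else
        @socket.puts "unknown #{command}"
      end
    end
  end

  private

  def cmd_ping
    @socket.puts "pong from #{Process.pid}"
  end

  def cmd_quit
    @socket.puts "bye"
    exit!(0)
  end
end

# The master: fork three workers, keeping one end of a socket pair for each
workers = 3.times.map do
  parent_end, child_end = UNIXSocket.pair
  pid = fork do
    parent_end.close
    SketchWorker.new(child_end).run
  end
  child_end.close
  [pid, parent_end]
end

# Send each worker a ping and then a quit, and reap it
replies = workers.map do |pid, sock|
  sock.puts "ping"
  pong = sock.gets.chomp
  sock.puts "quit"
  bye = sock.gets.chomp
  Process.wait(pid)
  [pong == "pong from #{pid}", bye]
end
```

Adding a command is a matter of adding a method in a subclass, so the supply of commands is not limited by the supply of spare Unix signals.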

What’s not so good? Reloading – ironically, the whole reason we set out to write the thing. Reloading is implemented by having the master send a control message to the children, and the children then reload themselves (using Projectr or however else you want to). But when you have 300MB x n children to reload you’d much rather do the reload once in the parent and then kill and respawn the kids than you would have each of the kids go off and do it themselves – that way lies Thrash City, which is a better place for skateboarders than servers. (This would also be a bad thing for sharing pages between parent and child, but I am informed by someone who sounded convincingly knowledgeable that the garbage collector in MRI writes into pretty much every page anyway thus spitting all over COW, so this turns out not to be a concern at present. But someday, maybe – and in the meantime it’s still kinda ugly)

What’s also not so good is that the interaction between “baked in” stuff that needs to happen for some actions – like “quit” – and user-specified customizations is kind of fuzzy and it’s not presently at all obvious if, for example, a worker subclass command should call super: if you want to do something before quitting, then obviously you should then hand off to the superclass to actually exit, but if you want to define a reload handler then you don’t want to call a non-existent superclass method when you’re done. But how do you know it doesn’t exist? Your worker might be based off another customisation that does want to do something important at reload time. So it’s back to the drawing board to work out the protocol there, though rereading what I’ve just written it sounds like I should make a distinction between notifiers and command implementations – “tell me when X is happening because I need to do something” vs “this is the code you should run to implement X”.
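One possible reading of that notifier/implementation split, sketched with entirely hypothetical names (this is not the thin-prefork API, and the eventual protocol may look nothing like it): any number of notifiers may be attached to a command, exactly one implementation is registered for it, and nobody has to wonder about calling super.

```ruby
# Hypothetical names throughout; a sketch of the idea, not a real interface.
class SketchCommands
  def initialize
    @implementations = {}                      # exactly one per command
    @notifiers = Hash.new { |h, k| h[k] = [] } # any number per command
  end

  # "this is the code you should run to implement X"
  def implement(name, &block)
    @implementations[name] = block
  end

  # "tell me when X is happening because I need to do something"
  def notify_on(name, &block)
    @notifiers[name] << block
  end

  # Run the notifiers in registration order, then the one implementation
  def dispatch(name)
    implementation = @implementations.fetch(name) do
      raise ArgumentError, "no such command: #{name}"
    end
    @notifiers[name].each(&:call)
    implementation.call
  end
end

log = []
commands = SketchCommands.new
commands.implement(:quit) { log << "exiting" }
commands.notify_on(:quit) { log << "closing database handles" }
commands.notify_on(:quit) { log << "flushing logs" }
commands.dispatch(:quit)
```

A customisation layered on top of another one adds its own notifier without needing to know whether anything beneath it also cares about the event.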

And why does the post title namecheck Sequel? Because my experience with other platforms is that holding database handles open across a fork() call can be somewhat fraught and I wanted somewhere to document everything I know about how Sequel handles this

Syndicated 2011-05-03 15:11:12 from diary at Telent Netowrks

Introducing Projectr

Why might you want to know the names of all the files in your project? One might turn the question around and ask why you possibly would not want to, but maybe that’s not a constructive dialogue. So let’s list some use cases

  • to load them into your application
  • to load them into irb for debugging or for help in constructing test cases
  • to process them through rdoc
  • to put them in a gem
  • to print them (don’t laugh, I did that the other day when I was having trouble deciding how to refactor a bunch of stuff)

As far as I can see from my survey of the Ruby world, the current practices for each of these use cases are pretty ad hoc. Maybe you write a file full of require or require_relative statements (as the RBP blog author likes to do), maybe you use a glob, maybe you write a MANIFEST file, but there seems to be a significant lack of DRYness about it all. This led me to think there is a gap in the market for

  1. a language for describing the files that a project comprises
  2. some tools to interrogate a project description written in this form and find out what’s in it
  3. some code to load them into a running interpreter – and for bonus points, when files have previously been loaded into said image but since changed on disk, to reload them. This could be used in irb sessions, or could form the basis of a development-oriented web server that reloads changed files without needing to be stopped and started all the time

Note that item 3 above gives us something that “file containing list of require statements” doesn’t, because it allows us to reload files that we’ve already seen instead of just saying “meh, seen it already”. If you’re using a comparatively low-powered machine then reloading your entire app in irb every time you change a method definition is unnecessarily and obviously slow. If you’re also using Bundler (which I rather like now it’s settled down a bit, and will write more about in a future entry) then the additional bundle exec is not just slow, it’s SLow with a capital S and a capital L and a pulsating ever-growing O that rules from the centre of the underworld.
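The general technique behind "reload only what changed" is simple enough to sketch: remember each file's modification time at load and compare next time round. This is a guess at the approach rather than a description of Projectr's internals; SketchLoader and the $sketch_loads counter exist only for the demonstration.

```ruby
require "tmpdir"

# Not Projectr's implementation: just the mtime-comparison idea in miniature.
class SketchLoader
  def initialize(paths)
    @paths = paths
    @loaded_at = {} # path => mtime seen when we last loaded it
  end

  # Loads files that are new or changed; returns the paths it loaded
  def load!
    @paths.select do |path|
      mtime = File.mtime(path)
      next false if @loaded_at[path] == mtime
      load path
      @loaded_at[path] = mtime
      true
    end
  end
end

loaded = Dir.mktmpdir do |dir|
  files = %w[file1.rb file2.rb].map { |name| File.join(dir, name) }
  # Each file bumps a global counter when loaded, so loads can be counted
  files.each { |f| File.write(f, "$sketch_loads = ($sketch_loads || 0) + 1\n") }

  loader = SketchLoader.new(files)
  first  = loader.load!.size                     # both files are new
  second = loader.load!.size                     # nothing has changed
  File.utime(Time.now, Time.now + 60, files[0])  # simulate an edit to file1
  third  = loader.load!.size                     # only the touched file
  [first, second, third]
end
```

Unlike require, which keeps a list of what it has seen and refuses to look again, load re-evaluates the file every time it is called, which is what makes this workable.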

Here’s one I made earlier

Projectr::Project.new :test do
  # directories may be named by symbols or strings
  directory :example do
    #as may files
    file "file1"
    file :file2
    directory "subdir" do 
      file :subdir_file
    end
  end
end

h=Projectr::Project[:test]
h.load!   # it loads all the files
# and again
h.load!   # nothing happens this time
# touch example/file1.rb
h.load!   # loads only the changed file

At the time of writing this, the github version does about that much, but is quite clearly still version 0. Stuff I am still thinking about:

  • Load-order dependencies. Lisp programmers may recognise that Projectr was inspired by using (and indeed implementing a version of) defsystem (or more recently here) but Projectr is almost minimally featured compared to any of the Lisp-based defsystem facilities. Many of those features I don’t have any strong evidence that the Ruby world would find use for, but load-order dependencies allow us to say for example that if file A defines a DSL and files B and C use that DSL, changing A should make the computer reload B and C as well
  • It seems clear to me that defining a project and loading it are two separate operations – you may wish instead to define it and then generate a Gemspec, for example – but there’s still a lot of verbiage in the common case that you do want to load it, and I haven’t really found file layout and naming conventions that I feel good about
  • likewise, what happens when we redefine the project itself (as would happen if we want to add a file to it, for example) is slightly up for grabs. Should the project definition file be considered a part of the project?
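To make the load-order bullet concrete, here is a small sketch of working out what else needs reloading when one file changes. The file names and the DEPENDS_ON table are invented, and nothing like this exists in Projectr yet; it computes only the set of files, not the order in which to reload them.

```ruby
require "set"

# Hypothetical project: a.rb defines a DSL used by b.rb and c.rb; d.rb uses c.rb.
DEPENDS_ON = {
  "a.rb" => [],
  "b.rb" => ["a.rb"],
  "c.rb" => ["a.rb"],
  "d.rb" => ["c.rb"]
}

# Everything that must be reloaded when +changed+ changes, including itself
def reload_set(changed, depends_on)
  result = Set[changed]
  loop do
    # Pick up any file not yet included that depends on something already in the set
    added = depends_on.select do |file, deps|
      !result.include?(file) && deps.any? { |d| result.include?(d) }
    end.keys
    break if added.empty?
    result.merge(added)
  end
  result
end
```

With this table, changing a.rb drags in all four files, while changing c.rb only brings d.rb along with it.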

I will doubtless form my own opinions on all of these issues in time and with more experience of using this tool in practice, but feedback on them and on the general approach is warmly welcomed.

Fork, clone, spindle, mutilate

Syndicated 2011-05-02 09:39:18 from diary at Telent Netowrks

I did it my way

short unoriginal observation on ruby blogging engines: quicker to write your own than evaluate all the other poorly documented ones
… this observation only holds if you skimp on the documentation of course. which is where we came in

If you can see this, you can see my blog design all changed again. This time it’s a Ruby Sinatra application (whence the name my-way) running on thin-prefork, which keeps the article texts in git and uses RedCloth plus some ugly regexps to turn them into HTML. The Markdown vs Textile decision is not an especially interesting one in the first place, but gets a lot easier still when you have something like 9 years worth of previous articles in Textile format.

Publishing is achieved by pushing to a git repository on the live machine (a Bytemark vm). A post-update hook in the remote repository is responsible for checking out the updated commit (git doesn’t like pushing to non-bare repositories) and sending SIGHUP to the running instance of my-way which causes it to reindex files.

dan@bytemark:~$ cat /home/git/my-way.git/hooks/post-update        
#!/bin/sh
GIT_WORK_TREE=/home/dan/src/git/my-way git checkout -f
kill -1 `cat /tmp/my-way.pid`
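On the receiving end, the usual Ruby idiom is for the signal handler to do nothing but set a flag, leaving the real work to the main loop. The sketch below shows that idiom in isolation; it is an assumption about the general approach, not a copy of what my-way does, and the counter stands in for the actual reindexing.

```ruby
# Sketch: the handler only sets a flag; the main loop does the real work.
reindex_requested = false
reindex_count = 0

Signal.trap("HUP") { reindex_requested = true }

Process.kill("HUP", Process.pid) # what `kill -1 <pid>` does from the hook

# Stand-in for the server's main loop, bounded so the example terminates
deadline = Time.now + 2
until reindex_count > 0 || Time.now > deadline
  if reindex_requested
    reindex_requested = false
    reindex_count += 1 # real code would rescan the article files here
  end
  sleep 0.01
end
```

Keeping the handler trivial sidesteps the restrictions on what can safely be done from inside a trap block.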

The version of my-way on github lags the actual version slightly, because I need to separate the engine from the articles and from the config data (there are things like adsense subscriber id, flickr api keys, etc) before I push the latter to a public service. Will clean it up in the next few days.

And my apologies to RSS feed subscribers. I’ve finally dropped the /diary prefix on the URL for this blog, and the old RSS feed didn’t use GUIDs and I’m too lazy to make the new one do so either, so the upshot is you just got the ten most recent articles in your feed again. Sorry.

Syndicated 2011-05-01 10:50:35 from diary at Telent Netowrks

What I miss most about Lisp

It’s been three months since I wrote anything longer than one line in Lisp, and over a year since I wrote more than a screenful of the stuff.

What I miss most is not CLOS or the REPL or even macros (per se, anyway). It’s

  • the distinction between READ and EVAL: a sane syntax for constructing complex data structures that look like code, but without actually having the data structures in question interpreted
  • and backtraces with the values of function parameters in them. When you’re doing the same thing 1000 times to different database rows or objects in a collection, and one of them has a nil in it somewhere, it would be really nice to know which one.

And maybe the REPL (although irb does most of that). And kinda sorta Defsystem, but I seem to be in the process of reimplementing that

But I hope soon to get Rubinius installed, just because I still have the irrational opinion that a grown-up programming language ought to be able to implement itself (and I have a thing for native code) so project 1 there is to see if I can hack the backtrace thingy at least into it.

Syndicated 2011-04-30 22:23:33 from diary at Telent Netowrks

TDD, BDD, executable specification

The new system at $WORK finally went live about a week ago, hurrah.

The upgrade itself took a few hours longer than I'd have liked, and (short shameful confession time), some (but probably not all) of this could have been caught by better test coverage. Which set me on a path towards SimpleCov (I'm using Ruby 1.9, rcov doesn't work), which led me to start looking at the uncovered parts, which set me to thinking. Which, as we all know, is dangerous.

TDD advocates (and pro-testing people in general) say "Don't test accessors". There are two reasons to say this that seem to my mind like good reasons: Ron says it because he wants you to write tests that do something else (something useful) that happens to involve calling those accessors. J B Rainsberger says it because "get/set methods just can't break, and if they can't break, then why test them?"

The problem comes when you adopt the mindset typified by BDD that "the test examples are actually your executable specification", because in that case how do you specify that the object has an accessor? This is not an unreasonable demand. Suppose we have objects whose purpose is to store structured data that will be used by client code - for example, User has an age property. Jbrains - which must surely be the Best Nickname Ever for a Java guy - says there's no useful test you can write for this (or not unless you don't trust your platform or something, but that way madness lies). But even if we are going to write one: a test that stores one or a few example values can easily be faked by the bloody-minded implementor ("the setter is called with the argument 42, then the getter is called and should return 42? I know! def age; 42; end") and a test that stores all possible values and tests they can be retrieved will take forever to write/read/run. Really the best notation in which to specify the behaviour of that property is the same notation which, when run by the interpreter, will implement the said behavior -

    attr_accessor :age

It's not just accessors either. Everything on the continuum between declarations of constants (SECONDS_PER_DAY=86400, are you really going to write a test for that?) and simple mathematical formulae

    class Triangle
      def area
        self.base * self.height / 2.0
      end
    end

are most readably expressed to humans as, well, the continuous functions that they are, not the three or four example data points that we might write examples to test for. For any finite number of test cases, you can write a giant case...when statement that passes all of them and still doesn't work in the general case.
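To see the bloody-minded implementor at work, here is a triangle that passes three example checks and knows nothing about geometry. FakeTriangle and the example values are made up for the illustration.

```ruby
# A deliberately bloody-minded "implementation" that satisfies three examples
class FakeTriangle
  attr_accessor :base, :height

  def initialize(base, height)
    @base = base
    @height = height
  end

  # No formula here: just the answers to the questions the tests happen to ask
  def area
    case [base, height]
    when [2, 3]  then 3.0
    when [4, 5]  then 10.0
    when [10, 1] then 5.0
    end
  end
end

examples = { [2, 3] => 3.0, [4, 5] => 10.0, [10, 1] => 5.0 }
all_pass = examples.all? { |(b, h), expected| FakeTriangle.new(b, h).area == expected }
general  = FakeTriangle.new(6, 7).area
```

Every example passes, yet a 6 by 7 triangle has no area at all (the case statement falls through and returns nil), where the one-line formula would have given 21.0.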

Yes, we could and often should write a couple of tests just to make sure we haven't done anything boneheaded in implementing the function, but they're not spec. They're just examples.

But here's the rub: where or how do we put that code to make it obvious that it's specification that happens also to be a valid implementation - and not just implementation that may or may not meet a spec expressed in some other place/form? If we're laying out our app in conventional Ruby style, it can't go in the spec/ directory because that doesn't actually get run as part of the application, and it shouldn't go in lib/ or models/ or wherever else unless we are prepared to make our clients rootle through all that code looking for whatever "this is specification not just an implementation detail" flag we decide to adopt when they want to use our interface.

I'm going to make a suggestion which is either radical or bone-headed: we should smush the rspec-stuff together with the app code: embed examples (which may in some cases be specification and in other cases be "smoke tests") in the same files as the implementation (which may itself sometimes be specification and other times be the result of our fallible human attempts to derive implementation from spec), and then we can have some kind of annotations to say which is which, and then we can have some kind of rdoc-on-'roids literate programming tool (To Be Implemented) go through the whole lot and produce two separate documents. One for people who want to use our code and want to know what it should do, and the other for the people who have to hack on it and need to know how it does. Or doesn't. And then maybe we can have code coverage metrics that actually mean something.

Syndicated 2011-04-07 14:26:57 from diary at telent netowrks
