Older blog entries for Bram (starting at number 48)

8 Dec 2002 (updated 8 Dec 2002 at 06:23 UTC) »
ladypine: Viewing cost and benefit between individuals as somehow comparable is dubious. For one thing, it's completely gameable - individuals who exaggerate costs and benefits to themselves, or are just plain histrionic, are rewarded unduly. For another, social policies based on such theories tend to be downright racist, since they slant heavily towards whatever the dominant gender, race, or other group cares about most.

A much simpler and more justifiable approach is to try to reach Pareto efficiency, which is a situation in which no two individuals can make an exchange and both be happier. This cleverly avoids making subjective comparisons of different people's worth, and also yields a straightforward algorithm for maximization. The success of capitalism can be viewed as a testament to the robustness of greedy algorithms.

robocoder: I suggest setting up CVS, a mailing list, syncmail, and a todo list. Picking a bug tracking tool to begin with is like starting the construction of a bridge by digging a mass grave for everyone who will die in the building process.

There's a very difficult puzzle in the latest Scientific American. It goes as follows:

Three of the nine members of the president's cabinet are leaks. If the president gives a tip to some subset of his advisors and all three interlopers are in that subset, then that tip will get leaked to the press. The president has decided to determine who the leaks are by selectively giving tips to advisors and seeing which ones leak. How can this be done by giving each tip to three or four people, having no more than two leaked tips, and using no more than 25 tips total?

There's a solution on the web site, although the one I figured out is very different.

Here's my solution. First, note that if a tip to four people leaks, the leaks can be found using only three more tips, giving each of them to three of the four.

Arrange eight of the nine advisors on the vertices of a 2x2x2 cube. Test each of the 12 subsets which are coplanar, plus the 2 subsets which are all the same color if they're colored like a checkerboard. If any of those leak, we're done. If not, we've determined that the odd one out must be one of the leaks.

Next, arrange the eight we previously had in a cube formation into a line, and test all 8 configurations consisting of the odd one out together with x, x+1, and x+3 (modulo 8).

If none of those hit, then the two other leaks must be of the form x and x + 4, which there are four possibilities for. Three of them can be tried directly and if none leak then the last one can be inferred.

This leads to a total of 14 + 8 + 3 = 25 tips. It's a very hard problem; it took me about 45 minutes to figure out a solution.
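The counting behind this scheme can be checked by brute force. The following is a minimal sketch, not part of the original solution: the encoding of the cube corners as 0/1 coordinates, the determinant-based coplanarity test, and all of the names are assumptions made for illustration. It checks that the 14 cube tips cover every possible trio among the eight advisors, and that the 8 line tips cover every pair except the four of the form x and x + 4.

```python
from itertools import combinations, product


def coplanar(points):
    """True if four 3D points lie in one plane (zero triple product)."""
    p0, p1, p2, p3 = points
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    w = [b - a for a, b in zip(p0, p3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return det == 0


# Eight advisors on the corners of the cube.
vertices = list(product((0, 1), repeat=3))

# The 12 coplanar subsets of four corners (6 faces plus 6 diagonal planes).
planes = [frozenset(q) for q in combinations(vertices, 4) if coplanar(q)]

# The 2 checkerboard subsets: corners with even or odd coordinate sum.
checker = [frozenset(v for v in vertices if sum(v) % 2 == parity)
           for parity in (0, 1)]

cube_tips = planes + checker

# Every trio of cube corners should sit inside at least one cube tip.
covered_triples = {frozenset(t) for tip in cube_tips
                   for t in combinations(tip, 3)}
all_triples = {frozenset(t) for t in combinations(vertices, 3)}

# Second stage: the odd one out plus x, x+1, x+3 (mod 8) on a line of eight.
line_pairs = {frozenset(p) for x in range(8)
              for p in combinations((x % 8, (x + 1) % 8, (x + 3) % 8), 2)}
missing_pairs = {frozenset(p) for p in combinations(range(8), 2)} - line_pairs

print(len(cube_tips), covered_triples == all_triples, sorted(map(sorted, missing_pairs)))
```

Since 14 tips of four people contain 14 * 4 = 56 trios and there are exactly 56 trios among eight people, each trio is covered exactly once.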

An additional question given is whether increasing the number of advisors included in a tip can reduce the number of trials necessary. The answer is yes. For example, to modify the technique I gave, the first test could be of six of the corners of the cube, all except for two adjacent ones. This effectively does three tests at once, so the total number of tests needed drops to 23. If the first test turns up positive, all but one of the 20 subsets of 3 of 6 can be tried individually, and if none of them leak then the last one can be inferred, for a total of one initial tip to 6 plus 19 others, or 20 total, so leaking on the first tip isn't the limiting case.

The CodeCon submissions deadline is December 15th. Get those submissions in.

The time and place have been set. It will be February 22-24 at Club NV in San Francisco.

I just finished my first game of go on a 19x19 board. I played against a 15kyu player, and lost by about 100 points (I should have lost by about 80, but blundered stupidly near the end). I literally feel ready to throw up. I'm very kinaesthetic and apparently my body confuses playing a position I don't understand at all with being dizzy. Hopefully this will stop happening in the future.

19x19 go is huge. Probably equivalent to about 30x30 hex.

Kiseido is a nice go server. A lot of interesting stuff has been done with CGoban, and Java Web Start is actually working okay now. Web Start's proclivity to make applications look like they're subordinate to it as an architecture would make me not even consider using it though - that's so obnoxious as to be offensive.

Compare Joel's current time giving site design with the real thing.

Sylvester's problem is cool.

Surprisingly, the rules of go are somewhat controversial.

This game is rather interesting. I figured out at least one position in which O has a simple forced win. A more interesting question is whether there are any positions in which X has a forced win. I doubt it.

I tried out my original rules Bohemia on a 4x4 board. It's a close game, but the first player can always win. I didn't try any possibilities where the bohemian moves first and doesn't pass. Here is an interesting question - if we take a generalization of this game, in which there are sets of subsets which the square wins if any of them become monochromatic, and the bohemian can pass, and the bohemian can go first, is there any such game in which the bohemian can always win, but loses by passing on the first move? Note that my new rules aren't of this form.

Joel on API Apology

Joel Spolsky has an article in which he states

All non-trivial abstractions, to some degree, are leaky.

This is overly dogmatic - for example, bignum classes are exactly the same regardless of the native integer multiplication. Ignoring that, this statement is essentially true, but rather inane and missing the point. Without abstractions, all our code would be completely interdependent and unmaintainable, and abstractions do a remarkable job of cleaning that up. It is a testament to the power of abstraction and how much we take it for granted that such a statement can be made at all, as if we always expected to be able to write large pieces of software in a maintainable manner.

There are really two separate statements here (what does 'leaky' mean, anyway?). The first one is -

All APIs contain artifacts of what they're built on top of.

As I said before, this is overly dogmatic, but mostly true. The example given, however, is extremely curious. TCP requires quite a tangent to explain ('look how smart I am, I know about TCP') and is one of the most robust, useful, and widely deployed abstractions in existence. I've spent a huge amount of time using TCP in very novel ways and haven't set a socket option in my life. Furthermore, Joel cites TCP's failure mode as an artifact of IP, when it is in fact an artifact of wires and electricity, and an extremely well done and clean one at that. Whether the connection drops due to network outage or the counterparty machine going down, you still get a single well-defined failure, with clear semantics as to what might have arrived on the other end.

TCP actually has plenty of real artifacts, such as slow start and high latency during bulk transfers, but these would clarify the distinction between the statement about artifacts and the one Joel is really trying to prove, which is -

All APIs are Broken

As I explained, the TCP example doesn't prove this at all - if a failure mode is defined in the API, then that doesn't mean that the API breaks when it gets invoked. But this statement does apply to most of the other examples Joel goes on to cite - NFS is a mess, C++ strings aren't real strings, COM internals aren't properly hidden, ASP.NET doesn't have proper string mangling code to make GET parameters properly, etc.

You may notice that all the Microsoft tools are left to the end, right before the claim that the law of leaky abstractions is dragging us down. This essay isn't about API design, it's about software apology, an attempt to claim that all software is inherently bad hence it's okay that Microsoft's is bad. Like any good microsoftie, Joel starts with the assumption that the (very flawed) tools he uses are completely acceptable, and works backwards from there.

12 Nov 2002 (updated 12 Nov 2002 at 11:18 UTC) »
Protocol Stuff

The single sign-on protocol I gave earlier can be made secure when re-using challenge/response pairs by using HMAC instead of raw sha1 (HMAC is built using sha1). That greatly simplifies implementation on third party web sites and gets rid of a nasty denial of service attack.
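As a rough illustration of the HMAC variant, here is a minimal sketch. The entry does not spell out how the HMAC inputs are arranged, so using the secret response as the HMAC key and the login as the message is an assumption, as are the function names and the hex encoding.

```python
import hashlib
import hmac


def verify_code(login: str, response: bytes) -> str:
    """Hypothetical verify value: HMAC-SHA1 keyed with the secret response.

    Assumption: the response is the key and the login is the message.
    """
    return hmac.new(response, login.encode("utf-8"), hashlib.sha1).hexdigest()


def check(login: str, verify: str, response: bytes) -> bool:
    """Check a verify value received in a callback, in constant time."""
    expected = verify_code(login, response)
    return hmac.compare_digest(expected.encode("utf-8"), verify.encode("utf-8"))


if __name__ == "__main__":
    code = verify_code("bram", b"secret-response")
    print(check("bram", code, b"secret-response"))     # matches
    print(check("mallory", code, b"secret-response"))  # does not match
```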

Bohemia

I tried playing a few games of Bohemia and was a bit disappointed. There are a few specific things I'm going for with the rules -

  • The game should be even on a medium size board, ideally in the ten to twenty on a side range. The rules as I presented them lead to a roughly even game on a 5 by 5 board, which is way too small.

  • The beginning of the game should be very positional, gradually leading to a very tactical endgame. The rules I gave are a lot more like Renju, with lots of forcing lines.

  • There should be a complicated interplay between pieces of either color, with many double-edged moves. With all the forcing moves, the square winds up emphasizing one color and the bohemian mostly plays the other, and the bohemian's color's pieces don't generally make squares.

To try and fix those problems, I've come up with the following rule change - the square's new goal is to form a square whose corners are black on the lower left and upper right, and white on the upper left and lower right. This straightforwardly enforces a complicated interplay between both colored pieces, and since forming a clump of one color is very counterproductive, the square no longer has much benefit in playing lots of forcing moves. Also, the reduction in number of winning arrangements and greater complexity of having to play both colors probably makes the evenly matched board size larger.
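The new winning condition can be sketched as a simple board scan. This is only an illustration: the board representation (a list of rows holding "B", "W" or None), the convention that row 0 is the bottom row, and the function name are assumptions, not part of the rules.

```python
def find_mixed_square(board):
    """Return (row, col, side) of a winning square under the new rule, or None.

    Assumptions: board[row][col] is "B", "W" or None, and row 0 is the
    bottom row, so (row, col) is the lower left corner of the square.
    """
    n = len(board)
    for row in range(n):
        for col in range(n):
            # side is limited so that row + side and col + side stay on the board
            for side in range(1, n - max(row, col)):
                if (board[row][col] == "B"                      # lower left
                        and board[row + side][col + side] == "B"    # upper right
                        and board[row + side][col] == "W"           # upper left
                        and board[row][col + side] == "W"):         # lower right
                    return (row, col, side)
    return None


if __name__ == "__main__":
    print(find_mixed_square([["B", "W"], ["W", "B"]]))
```

Note that the mirror image arrangement, with white on the lower left and upper right, does not count as a win under this reading of the rule.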

Further experimentation is once again necessary.

If anyone can figure out a proof that with a sufficiently large board under this new rule the square can force a win, I'd like to hear it. I don't see one immediately.

tripp: Yes, the challenge/response pairs are for the purposes of (drastically) reducing the amount of communication between blog.example.com and signon.example.net

I avoided using public keys for several reasons. The biggest one is that if signon.example.net relied on knowing blog.example.com's public key, then if that public key were lost or compromised some very serious human intervention would be required at signon.example.net, while with the technique I gave blog.example.com would simply issue a new query to get a new list of challenge/response pairs. Getting a phone call from someone demanding that you manually change a key in a database, and having to somehow guess if they're who they claim they are or someone trying to commit fraud, is a problem you want to deal with never.

Also, the technique I gave is simpler. There's no key setup and management queries necessary at signon.example.net, and the only libraries blog.example.com needs are string mangling and sha1, which are already available on almost all web scripting systems. Also, any public key operations at signon.example.net can result in a significant amount of CPU load at not very high levels of scaling; doing the operations for https is bad enough.

I can't think of any disadvantages to the method I gave over using public key trickery - even the extra bandwidth it uses is paltry.

Squares

I figured out a cleaner proof of my earlier theorem about monochromatic squares, based on the Hales-Jewett theorem. I looked up the section on theorems proven using Hales-Jewett in a book I have and, sure enough, I re-invented what's known as Gallai's theorem.

The upper bound on lattice edge size is two to the power of Hales-Jewett with two colors and edge length four. This is still a huge value. At some point I'll get around to throwing blackbox at the problem to try to get a reasonable guess of the actual value.

single sign-on protocol

Raph mentioned that we discussed single sign-on protocols, and I came up with some good ideas. I'll summarize now.

There is a single monolithic site which users maintain their login information and identity on. This is very convenient from an end user perspective, since they only have to remember one password and maintain their information in one place. It's also very appealing for web sites, since they no longer have to manage any login information or do tech support for lost passwords and the like.

Say the site which does the signon is signon.com and the site using those logins is blog.com. When a user is doing something which may require login to blog.com, they'll get a hyperlink saying 'log in', which will link to signon.com. When the user clicks on that, they'll either be prompted to set up an account on signon.com, or if they already have one and have an active session, they'll get redirected back to blog.com, where they're now logged in.

That's the end user experience, now for the technical details.

blog.com makes an http or https query in advance to get a bunch of challenge/response pairs, which are generated dynamically by signon.com. A huge number of these can be stockpiled, so the query only has to be done rarely. The only real restriction on their use is that they can only be used once each. The login hyperlink goes to (or redirects to) a url on signon.com, and includes a challenge code and callback url as GET parameters. For example, it might be

http://signon.com/signon?challenge=a63f34b34d&callback=http://blog.com/login.cgi

signon.com then gets the user's login from their cookie, verifies their identity (either by an active session cookie or by making them type in their password again) and redirects back to the callback, giving it GET parameters of the user's login identity and sha1(login + response). For example, it might be

http://blog.com/login.cgi?user=bram&verify=ssonuoocrdua

blog.com can then easily check that verify is the correct value, and give the user a session cookie if so.
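The blog.com side of this exchange might look roughly like the following minimal sketch. The class and method names are hypothetical, the hex encoding of the sha1 value is an assumption, and how blog.com remembers which challenge it handed to which visitor is left out (here the caller simply passes the challenge back in).

```python
import hashlib
import hmac
from urllib.parse import urlencode


def expected_verify(login: str, response: str) -> str:
    """sha1(login + response), hex encoded (the encoding is an assumption)."""
    return hashlib.sha1((login + response).encode("utf-8")).hexdigest()


class ChallengeStock:
    """Hypothetical stockpile of challenge/response pairs held by blog.com."""

    def __init__(self, pairs):
        self._unused = dict(pairs)   # challenge -> response, not yet handed out
        self._pending = {}           # handed out, waiting for the callback

    def login_url(self, signon_url: str, callback: str):
        """Take a fresh pair and build the login hyperlink.

        Raises KeyError when the stockpile is empty and needs refilling.
        """
        challenge, response = self._unused.popitem()
        self._pending[challenge] = response
        query = urlencode({"challenge": challenge, "callback": callback})
        return challenge, signon_url + "?" + query

    def redeem(self, challenge: str, login: str, verify: str) -> bool:
        """Check the callback parameters; each pair works at most once."""
        response = self._pending.pop(challenge, None)
        if response is None:
            return False
        expected = expected_verify(login, response)
        return hmac.compare_digest(expected.encode("utf-8"), verify.encode("utf-8"))


if __name__ == "__main__":
    stock = ChallengeStock({"a63f34b34d": "example-response"})
    challenge, url = stock.login_url("http://signon.com/signon",
                                     "http://blog.com/login.cgi")
    print(url)
```

Popping the pair on redemption is what enforces the rule that each pair can only be used once.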

Raph suggested that for developer convenience it should be possible to send signon.com a nonce url which gets echoed back. I think this is more cleanly implemented by making signon.com notice if there are already GET parameters in the callback url, and preserving them if so. There is a special place in hell for whoever decided that the first GET parameter should have a different delimiter than the rest of them.
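Preserving existing GET parameters in the callback is straightforward with a URL library, which also takes care of the delimiter problem. A minimal sketch using Python's standard urllib.parse follows; the function name and the example parameter names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_params(callback: str, extra: dict) -> str:
    """Append parameters to a callback url, keeping any it already has."""
    parts = urlsplit(callback)
    params = parse_qsl(parts.query, keep_blank_values=True)
    params.extend(extra.items())
    # urlunsplit puts in the '?' and urlencode the '&' separators
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    print(add_params("http://blog.com/login.cgi", {"user": "bram"}))
    print(add_params("http://blog.com/login.cgi?nonce=42", {"user": "bram"}))
```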

signon.com can avoid having to store all challenge/response pairs by keeping a secret symmetric key and generating challenges at random and setting responses to be the encryption of the challenge.
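A minimal sketch of this stateless approach is below. Python's standard library has no block cipher, so the sketch swaps in HMAC-SHA1 as the keyed function instead of encryption; that substitution, the key size, the challenge length and all names are assumptions. Enforcing that each pair is used only once is not covered here.

```python
import hashlib
import hmac
import secrets

# Secret known only to signon.com (generated at startup in this sketch).
SECRET_KEY = secrets.token_bytes(32)


def response_for(challenge: str, key: bytes = SECRET_KEY) -> str:
    """Recompute the response for a challenge instead of storing it.

    Assumption: HMAC-SHA1 stands in for 'encryption of the challenge'.
    """
    return hmac.new(key, challenge.encode("utf-8"), hashlib.sha1).hexdigest()


def new_pair(key: bytes = SECRET_KEY):
    """Generate a random challenge and its derived response."""
    challenge = secrets.token_hex(5)  # 10 hex characters
    return challenge, response_for(challenge, key)


def verify_for(login: str, challenge: str, key: bytes = SECRET_KEY) -> str:
    """Value sent back to the callback: sha1(login + response)."""
    response = response_for(challenge, key)
    return hashlib.sha1((login + response).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    challenge, response = new_pair()
    print(challenge, verify_for("bram", challenge))
```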

I'm very happy with this protocol. It's easy to implement on all sides, and has a smooth implementation path - web sites are immediately motivated to use it because it removes the burden of account management from them. Also, blog.com has complete control over whether all requests are done by http or https, allowing them to determine their level of security pragmatically and migrate over time. signon.com doesn't even have to configure anything new to support new sites - it's simply left running. Best of all, sites using signon.com can't compromise each other no matter how insecure they are.

The main problem with this whole scheme, of course, is that it's centralized, has a single point of failure, and generally isn't very cypherpunkly. That's explicitly the model Raph asked for, out of overwhelming engineering expediency, and I agree with that sentiment - this solution is far from perfect, but it's easy, and no one has demonstrated that decentralized single signon is workable.

Bohemia

Bohemia's board size problem can be fixed with an interesting version of the swap rule - one player decides the board size, and the other decides whether they want to be the square or the bohemian. Handicaps can be given with a spread on the board size. It's possible that an even chances board size is overly large or small; experimentation is necessary.

One dimensional Bohemia is possible as well, by having the square go for arithmetic progressions. This can be done with any number of colors and length of arithmetic progression. The small van der Waerden numbers indicate that length three and two or three colors might be a bit small, but more is probably very interesting. This has the distinction of being one of the few games on a one dimensional board which isn't completely lame.
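To get a feel for how small the two color, length three case is, a brute force check over completely filled lines is enough. This is a minimal sketch with hypothetical function names. It only asks whether some full coloring avoids a progression at all, which is the question the van der Waerden numbers answer, and says nothing about how the game plays out move by move.

```python
from itertools import product


def has_progression(coloring, length):
    """True if some arithmetic progression of the given length is one color."""
    n = len(coloring)
    for start in range(n):
        for step in range(1, n):
            if start + (length - 1) * step >= n:
                break  # larger steps would run off the board too
            if all(coloring[start + i * step] == coloring[start]
                   for i in range(length)):
                return True
    return False


def bohemian_can_survive(cells, colors, length):
    """True if some complete coloring of the line has no such progression."""
    return any(not has_progression(c, length)
               for c in product(range(colors), repeat=cells))


if __name__ == "__main__":
    for cells in range(3, 10):
        print(cells, bohemian_can_survive(cells, 2, 3))
```

With two colors and length three, a filled line of 8 cells can still avoid a progression but a line of 9 cannot, matching the van der Waerden number W(2, 3) = 9. Larger cases grow quickly and need something smarter than exhaustive search.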

Debian

BitTorrent (which had a new release today) has the necessary files to make debs, but isn't in debian. If any debian maintainer would like to volunteer, I'd very much appreciate it. Maintenance should be easy - the dependencies are straightforward and it's written in pure Python.

3 Nov 2002 (updated 4 Nov 2002 at 07:17 UTC) »
garym: Your description of the current state of the software industry is, unfortunately, quite accurate, and it won't get much better any time soon. The problem is that only programmers can really tell whether other programmers are any good, and we aren't going to be put in charge of the industry any time soon.

I've achieved a modest degree of success as an open source developer, but my advice for those considering it as a career is, only do it if you have to out of some immovable internal need. We are only just beginning to see the phenomenon of starving artist programmers, and it will only get more pronounced in the future, possibly even becoming standard for the industry.

I've seriously considered attempting a career as a screen writer. Although I have no experience with it, I have a good work ethic, natural talent, and possibly most important, a willingness to write high concept work with happy endings. Unfortunately those attributes don't even vaguely guarantee a decent living in that industry, so I've shelved any such plans until I either have enough money to not be worried about failure or have come to hate the computer industry so much that I just want out.

My current long term career plan is to get good enough at financial stuff that I can start a hedge fund or set strategies for one. Unfortunately there's a lot of luck and who you know involved in that industry, but at least it consistently pays well and, unlike screenwriting, the industry has a desperate need for real talent.

For now, I'm doing okay working on networking software, which unlike any of the above I have oodles of experience in. My other career option which involves something I'm already good at is juggling, but I find that a less appealing career path than any of the above.

I haven't even dreamt of getting paid for all the random weirdness I like posting to my advogato diary. That's only possible with a tenured university position, and the more time passes the farther away I seem to be from academia.

Bohemia

Here are the rules to an abstract board game I came up with called Bohemia -

The two players are the square and the bohemian, the square moves first.

Players alternate playing either white or black pieces on an n by n grid. Either player may place either color piece. The bohemian is allowed to pass.

If four pieces of all the same color get placed such that they form the corners of a square with horizontal and vertical sides, the square player wins. If the board gets filled up without that happening, the bohemian wins.
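The winning condition is easy to check mechanically. Here is a minimal sketch; the board representation (a list of rows holding "B", "W" or None for empty) and the function name are assumptions made for illustration.

```python
def find_square(board):
    """Return (row, col, side) of a monochromatic axis-aligned square, or None.

    Assumption: board[row][col] is "B", "W" or None for an empty cell.
    """
    n = len(board)
    for row in range(n):
        for col in range(n):
            color = board[row][col]
            if color is None:
                continue
            # side is limited so that row + side and col + side stay on the board
            for side in range(1, n - max(row, col)):
                if (board[row][col + side] == color
                        and board[row + side][col] == color
                        and board[row + side][col + side] == color):
                    return (row, col, side)
    return None


if __name__ == "__main__":
    board = [
        ["B", None, "B"],
        [None, "W", None],
        ["B", None, "B"],
    ]
    print(find_square(board))  # the four black corners form a square of side 2
```

The square wins as soon as this returns a result. The bohemian wins if the board fills up while it still returns None.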

I'm not sure what size board gives even chances to both players. That should be determined through experimentation.

A conjecture strongly related to this game is that for a sufficiently large board size the square must always win regardless of how the pieces are placed. This can be generalized to more than two colors. Also to lattices of side k, for which a square is a special example with k=2. This is a two-dimensional version of van der Waerden's theorem, which can of course be further generalized to higher numbers of dimensions.

I can't even prove this for two colors, two dimensions, and lattices of size two. It seems likely that such a straightforward problem in Ramsey theory has already been studied though.

Update: I figured out the following very nice proof of the simplest case of my conjecture.

First, we will demonstrate that for any number of colors, a sufficiently large lattice will contain three points of the form (a, b), (a + c, b) and (a, b + c) which are all the same color. I'll call these formations Ls.

By van der Waerden's theorem, a sufficiently large 2-colored lattice will contain a monochromatic arithmetic progression of length three somewhere along its bottom row. Let us say that those points are at (a, 0), (a + c, 0) and (a + 2*c, 0). The point (a, c) must be the opposite color to avoid forming an L with (a, 0) and (a + c, 0). Likewise the point (a + c, c) must be the opposite color from the progression, and hence the same color as (a, c), because of (a + c, 0) and (a + 2*c, 0). Now we can show by contradiction that there must be some monochromatic L - the point (a, 2*c) must either form an L with (a, c) and (a + c, c) or with (a, 0) and (a + 2*c, 0). Therefore for a sufficiently large 2-colored lattice there exist points (a, b), (a + c, b) and (a, b + c) which are all the same color.
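The case analysis in the two color argument is small enough to check exhaustively. The following minimal sketch takes a = 0 and c = 1 as units (an assumption that loses no generality for this check), fixes the three progression points to one color, and tries every coloring of the other three points named in the argument. All names are hypothetical.

```python
from itertools import product

# The monochromatic progression along the bottom row, in units of c.
BOTTOM = [(0, 0), (1, 0), (2, 0)]

# The three other points used in the argument: (a, c), (a + c, c), (a, 2*c).
FREE = [(0, 1), (1, 1), (0, 2)]

# The four Ls of the form (a, b), (a + c, b), (a, b + c) among those six points.
LS = [
    [(0, 0), (1, 0), (0, 1)],
    [(1, 0), (2, 0), (1, 1)],
    [(0, 1), (1, 1), (0, 2)],
    [(0, 0), (2, 0), (0, 2)],
]


def monochromatic_l(colors):
    """Return the first L whose three points share a color, or None."""
    for shape in LS:
        if len({colors[p] for p in shape}) == 1:
            return shape
    return None


results = []
for free_colors in product((0, 1), repeat=len(FREE)):
    colors = {p: 0 for p in BOTTOM}        # progression is color 0
    colors.update(zip(FREE, free_colors))  # everything else is free
    results.append(monochromatic_l(colors))

print(all(r is not None for r in results))
```

Every one of the 8 colorings of the free points yields a monochromatic L, which is exactly what the argument claims.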

We will now demonstrate that this is true for any number of colors by induction. Say we have shown for k colors that a lattice of size f suffices. By van der Waerden's theorem, a sufficiently large lattice colored with k+1 colors will have a monochromatic arithmetic progression of length 2*f along its bottom row, say that this is (a + x * c, 0) for 0 <= x < 2*f. Now consider the lattice formed by (a + p * c, c * (q + 1)) for 0 <= p < f and 0 <= q < f. Either some node in this lattice will be the same color as our arithmetic progression on the bottom row, in which case it forms an L with that, or it contains only k colors and hence contains an L by induction.

Now we will use the above lemma to prove our theorem. Say the size of a 2-colored lattice so that it must contain an L is g. Now make a sufficiently large lattice that if it's colored with 2 ** (g ** 2) colors, there must be an L, and substitute a square lattice with side length g for each of these, with colors corresponding to 2-color assignments within the sub-lattices. Call the offset between the L of squares which all contain the same pattern d. The lattice of offset d in both directions from the lower left must contain the exact opposite of every corresponding color or by construction form a square, so we will consider the case where it's the opposite. Since it's of size g, there must be an L in the lower left one of these, whose coordinates we will call (a, b), (a + c, b) and (a, b + c). It is now straightforward to show that by construction the points (a, b), (a + d + c, b), (a, b + d + c) and (a + d + c, b + d + c) are all the same color, hence there must be a monochromatic square.

This proof is so straightforward that I'd be extremely surprised if it isn't already known, but it's pretty and non-trivial enough that I'm happy to have come up with it.

Unfortunately the upper bound it gives is a little bit too big to be practical.
