Older blog entries for schoen (starting at number 225)


I went to Crackmonkey Night at Zeitgeist yesterday. There Mr. Bad made me a name tag which says "Demandu al mi pri la lingvo internacia". I was pleased.

I'm going to Target tomorrow to buy a fire extinguisher.

If you don't have a fire extinguisher, you too should go to Target and buy a fire extinguisher. (Or you can get one somewhere else.)


I'm enjoying my function "unshuffle"; in its present form, it seems to work (trivially identifying which card has been displaced from its order in a casually riffle-shuffled deck) about 60% of the time after three simulated riffles interleaved (heh) with arbitrary cuts. A big question is still how accurate those simulated shuffles are.

I think I can add another 20% to that, though. About half the time when it fails to identify the card, it's not because it's wrong, but just because it finds many possibilities. (There are a lot of possible cases, which were one subject of my long conversation in Israel mentioned in a previous diary entry. One possibility is mistakenly identifying some other card as displaced. Another possibility is an inability to decide which card has been displaced, from among several possibilities. Another possibility is not finding any cards which appear to have been displaced.)

Ides of March

I didn't beware them last year. Does that admonition apply only to Caesar, or to the general public, too?



Assuming even shuffling, how many shuffles on average would return a deck to it's original order?

If by "even shuffling", you mean

  • perfect faro shuffling: 30, if you're consistent about always in-shuffling or out-shuffling, but you don't care which.
  • shuffling well, but not in precisely the same way every time: probably 52!/2, because good shuffling actually has a genuine randomizing effect, and correlations and patterns will tend to disappear; you'll come around to the original order only by chance.
  • shuffling well, always breaking the deck 26 and 26: probably still 52!/2.
  • shuffling in a randomly-chosen but consistent way: this is a very good question, maybe harder than my old question about permutations; the answer is probably less than 90090.

jmg, thanks for your suggestions. I think we can assume that runs always contain at least one card, by definition; if a shuffler misses an attempted run, it just makes the older run continue longer (although I guess the idea is that the probabilities should be different: there's a higher probability of some much longer runs in that case).

Your point about the difference between the (remaining?) size of the two halves is something I actually addressed yesterday, before I read your note. I don't have a good solutions to it. Do people compensate to try to keep the absolute sizes of the halves equal? Or are they trying to keep the rates at which the halves are exhausted equal? These lead to very different models!

I've been thinking that people are trying to keep the rates even, which doesn't require any adjustment when they notice that the halves have different numbers of cards (because they don't care about that). Your suggestion seems not to be that the average run length should be larger from the side which has more cards, but rather that, when the shuffler notices a significant discrepancy, he will probably try to correct for it by causing a single long run then and there, to even out the two halves. Is that correct?

How much control do shufflers normally exert over the rates at which they let cards full? Not how much control they are capable of (for some extremely adept shufflers, that's perfect control!), but how much control they actually bring to bear after they've started to riffle the cards. In my case, as a poor shuffler, I find it very hard to slow down at all once the cards have begun to fall.

Applications? (A story about a conversation about a card trick)

So, way back in 1995 I was once talking with this physics student about a card trick in the lounge of a college dorm (which happened to be in Haifa, Israel). When we had talked about it for twenty minutes, a friend wandered by and said "You're still talking about that card trick?" "Yep", we said. They said "Well, we're going to dinner."

When they came back from dinner, we were still sitting at that same table. "You're not still talking about that card trick?" But the pad of paper with sketches of hypothetical patterns of shuffles interleaving told the tale. "You're actually still talking about it. OK, we're going off to do some work." So they went off and studied a bit.

Of course, we were unperturbed, and continued our conversation, so that when those people came by again and said "We're on our way out to a movie -- I bet you'll still be talking about that card trick when we get back", we just kind of nodded sagely. And off they went to a movie, and we kept on scribbling and arguing about runs and sequences and ambiguities and averages.

When the movie ended, of course, we were discovered not to have moved an inch, although we had consumed several more sheets of paper, which were scattered around the table. They talked about chains and immunity to cuts and about cyclic order and cyclic chaining and abitrary cut-points and probably something like "true effective cyclic chain length". The deck of cards we'd had was stacked into eight neat little piles, and we were still trying to make consensus estimates of some probabilities.

"You're still talking about that card trick! Look, everybody, Seth and Uri have been here since before dinner talking about a card trick." And it was true. Four hours, all told, if I remember correctly, and four very pleasant hours they were, too.

So there actually is a connection to the program I'm writing. The card trick in question is one I've already mentioned in my Advogato diary (July 17 of last summer); Knuth actually mentions it in an exercise in The Art of Computer Programming. The basic application is that, if I can actually get a realistic model of how people shuffle, I can find out how likely there is to be exploitable non-randomness in a deck after a small number of shuffles.

You know, it's kind of like TCP sequence prediction attacks, only more useless.

Anyway, I'm working on a simulation that tries to undo small numbers of shuffles.


Happy pi day.

IBM Linux ads

IBM's ad agency seems to really, really want me to be in an ad for Linux. It's tempting. I think they may do it right, in the sense of "honestly and respectfully": the ad agency reps seem to accept that free software is different from proprietary software and to consider it a good thing.

I hope they're not harping on the "counterculture" note too hard; in a sense, it's a very difficult problem. I read Commodify Your Dissent by Thomas Frank and found it very challenging; one might have predicted from the discussion of IBM there that IBM would soon be running ads about subversive, revolutionary programmers (and Frank would probably say that this is shallow or counterproductive, because IBM has zero permanent alliance with the values of free software or even with the development model).

There are conflicting tendencies. On the one hand:

  • Free software really is different; the differences are not trivial or vacuous.
  • Free software really is "alternative"; people really have a choice.
  • The non-free status quo really is problematic and it's not worth emulating (in the sense of "seeking to imitate", not in the sense of the Wine Project).
  • There really is a culture around free software which is interesting, genuine (it seems to me), and worth expressing.

On the other hand:

  • There is no point in gratuitously highlighting differences just for shock or curiosity value.
  • Free software is not guaranteed to be successful in business or in public interest just because it's different or free software community members are different in some way.
  • Doing free software is a normal thing to do. You don't have to be weird or bizarre to participate. It's not some special thing reserved for an elite or for a group of strange people -- or, to the extent that it is, anybody can join that group.

Celebrity counterculture (as opposed to general awareness, understanding, and respect) is not the best thing that can happen to free software. And I'm a little concerned that the IBM ads may be heading in that direction.

Remember: "It is not in heaven, that you should say, 'Who will go up for us to heaven, and bring it to us, that we may hear it and do it?'" Neither is it beyond the sea, that you should say, 'Who will go over the sea for us, and bring it to us, that we may hear it and do it?' (Deuteronomy 30:12-3 (RSV))


I got one. No beard cutting as yet.


Jamie's interpretation of a story about fortune, "Johnny Mnemonic", and the police:

You will be miserable until you learn the difference between scalar and list context.


I wrote a letter to the Chronicle which they didn't print but which somehow ended up in the hands of Richard Stallman. He liked it; it owes a lot to his essay Re-evaluating Copyright: The Public Must Prevail.


I didn't make much progress on my Python card-shuffling code, but I looked at some output, and it looked pretty realistic. There are even discernible effects from the fact that people don't cut a deck exactly half and half: this means that there will probably be an extra-long run from one half either at the top or at the bottom. (There's also another model possible in which the average length of the individual runs from one half is longer than the average length from the other. I'm not sure which is more physically plausible. I could be very brave and write to Persi Diaconis and S. Brent Morris and ask: I think one of them might know offhand.)


Sorry to mention too much stuff from slashdot here in one day. (I had independent sources for the other two, I promise!) Does anyone have experience with VoiceXML? I looked at Eve Andersson's article about VXML and was pretty impressed; I called up TellMe -- and you can too -- and listened to her program read me 100 digits of pi when I asked it. Come on, try it out, in honor of pi day. ("Call 1-800-555-TELL. At the main menu, speak the word 'Extensions.' Enter extension 58874.")

But anyway, I signed up for a developer account at TellMe Studio and quickly realized that

  • this is pretty powerful technology
  • I don't actually know anything about VXML

but I now have a little application that will greet me in my own voice and then synthesize the comment that "This is cool". On the phone, over an 800 number. So conceivably I could not only have a web page which can tell people what the weather is like here, or whether my coffee pot is on or off (if I actually had a coffee pot), but I could actually have an 800 number people could call and, in my own voice, it could tell them those things.

It seems that you have to pay to make your VoiceXML applications available to the public; I guess that's only fair.

On the Python edu-sig mailing list, there was some nice discussion about a class to represent a deck of cards, with some code which I got to contribute a bit to.

I managed to write somewhat simple in_shuffle() and out_shuffle() functions which perform (virtual) perfect faro shuffles on a deck. Using these, it's easy to watch what these shuffles do, and to verify experimentally that "eight out-shuffles or fifty-two in-shuffles will return a deck [of fifty-two cards] to its original order" (quoting Gardner from memory). So that's fun.

What I really want, as I wrote on edu-sig, is a good and realistic model for how most people -- you know, other than accomplished magicians and professional gamblers -- riffle shuffle. This is very tricky. I have written a function called casual_shuffle() which it would probably be easier to quote than to describe. (Source code is great for concise expression.) But:

I assume that in a casual riffle shuffle, a person is equally likely for any n between 0 and 52/X to divide the deck into "halves" of 26-n and 26+n. So if X=10, dividing the deck at 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, and 31 are considered equiprobable. (It should probably be a Gaussian distribution instead.)

Now cards are dealt in "runs" from alternate halves, starting with a random half. The length of a run is potentially limited only by the size of the half of the deck from which cards in the run are being dropped, but in my model it is less and less likely that a run will continue as the run gets longer and longer -- the incidence of longer runs should be less than the incidence of shorter runs. That part is correct -- but beyond that I don't know what's realistic, and so I have a decay model, where each successive card in a run diminishes (by a certain ratio) the probability that the run will continue. It is possible to choose the decay ratio.

OK, here's the actual casual_shuffle() function:

  def casual_shuffle(self, split_proportion=10, run_decay=0.8):
       # Simulate the riffle-shuffling pattern of a normal human, who
       # doesn't know how to do a perfect faro shuffle.
       L = len(self.cards)/2
       margin = L/split_proportion
       splitpoint = random.choice(range(L-margin, L+margin+1))
       a = self.cards[:splitpoint]
       b = self.cards[splitpoint:]
       c = []
       source = random.choice([a, b])
       while a or b:
          p = 1.0
          while source:
             if random.random() >= p:
                print p, " exceeded."
             p = p * run_decay
          if source is b:
             source = a
             source = b
       self.cards = c

It turns out that I was right nearby stephane on Sunday, in that she went by me on a train.

Dover Publications now has their catalogue on-line. Yay!

I hurt my arm a bit again. On average, that's going better than a week or two ago.


The conciseness and power of efdtt.c by Charles Hannum is very remarkable. (Take a look in the Gallery.) We're right to be impressed that the ancient epic poets wrote in hexameters, but also at the expressive skills of some of our modern programmers. I hope the author won't be offended that I think efdtt ought to be entered in the IOCCC.

Noe Valley and back again

I went off to track down two bookstores which turned out to be in Noe Valley -- Cover to Cover and Phoenix. I bought some interesting stuff in each. I was a bit amazed at how close by Noe Valley is to where I live now; I think I could walk to my cousin's place there in about a half an hour. (I had one of those "oh, this is that street corner!" experiences. Those are fun.)

A soda fountain called the Fountain of Youth out there sells a t-shirt which says something like "If you look at the back of my t-shirt, you'll see a good reason to go to Church every day"; on the back is the logo of the Fountain of Youth with its address on Church Street. Aha.

In other news

I read Contact by Carl Sagan on Sunday. (I bought it at Phoenix in the early afternoon, and saved my wrists a lot of stress by turning pages all through the early evening instead of typing.) In many ways, it was very good -- some parts were extremely absorbing, some parts were very thought-provoking. In other ways, it was a bit weak: some things were a little heavy-handed or implausible.

In a few places, I was very moved, but in other places, I wanted to shout "cliche!". It's extremely amusing that the book contains the line

The book was better than the movie.

I think I gave myself a stomachache by eating an entire container of soy ice cream in one sitting. I'm great at too much of a good thing.

SoFar by Andrew Plotkin is a very remarkable game; I wish I had the patience, skill, and now the wrist strength to finish it without a walkthrough. (I never win interactive fiction games, but I respect and admire them. I may have mentioned that Photopia made me cry.)


Happy birthday, mbp.

Long-time readers of my diary might be interested to see that the rule for converting binary to Gray code appears in The Armchair Universe by A. K. Dewdney. The rule for converting the other way doesn't.

Zack wants me to make one of Raymond Smullyan's number machines (which teach you about automata theory!) from chips. The trouble I have is that most of the number machines don't have any bound on the potential length of the input or output, and it seems that you have to read the entire input before you can give any input. Of course, that's pretty difficult to do in TTL -- it requires a RAM and a fairly elaborate state machine. I don't think I have any kind of RAM that I can readily use for this purpose, although that gives me yet another thing I can buy at an electronics store. (My understanding is that SIMMs are much too complicated, although maybe if I had some datasheets I could figure them out.)

I guess I'll go look at some bookstores and hope to do a little cleaning or electronics stuff in the evening.


I went to a CABAL installfest for the first time in a month or two and to a CABAL meeting for the first time in many months. I met some interesting people. There were two women there from an advertising agency which is preparing some ads for Linux and free software in general on behalf of IBM. (It's already been reported that IBM is considering a major ad campaign to promote Linux; it's true, and some of the people involved were there at the Cow Palace asking about it. Perhaps some of you reading this will be asked if you want to participate. From my conversations there, I think they could definitely do it right.)

Making sense of history
And drawing warmth out of the cold

(Dar Williams, "The Christians and the Pagans")


I had a great game of chess with Zack.

Won't somebody think of the children?

I went to the National Academy of Sciences committee meeting in Redwood City, where I heard Professor Lessig talking about an Internet content label scheme he's proposed. I gave him a bunch of technical questions afterward, but I've saved the tough ones for later.

There were various amusing things there, like hearing a man complain about Peacefire while right beside me. (I decided to keep my Peacefire shirt in my bag rather than wearing it, but I signed in to the meeting attendee roster giving my peacefire.org e-mail address.) I didn't rise up in defense of the organization or anything -- just listened to him and thought about technical issues.

It has single-point-of-failure and traffic analysis problems which ZKS Freedom and cypherpunk remailers mitigate or avoid -- but SafeWeb, recommended on peacefire-technical recently, looks very immediately useful. It's a generic anonymizing web proxy which you connect to using HTTPS. (OK, Anonymizer.com does that too, but they charge you for the HTTPS part and SafeWeb doesn't.) So if you don't mind the traffic analysis bit and you don't distrust SafeWeb, this is a very good tool both for anonymity and for circumventing censorship.

I actually had a very interesting conversation with Lessig, but I've already typed up an account of that, and I'm not going to repeat it here.

In other news

I'm thinking of making a weekend trip and announcing it. I have a whole list of a dozen or two possible weekend trips which I can start to make an invite people on.

My arms still hurt -- of course they wouldn't stop just because a doctor told me that I didn't have permanent injuries!

You got
What you want
Now you can hardly stand it

(Aimee Mann, "Wise Up")

216 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!