16 Aug 2002 Bram   » (Master)

Roshambo

Today I'll be writing about computer strategies for the classic game Roshambo, also known as paper scissors stone.

But ... but ... There's no strategy in Roshambo!

While it is true that playing random can reliably do dead average, in any tournament everyone will be trying to win, hence everyone will have bias, and strategies which do better than average against a wide range of opponents will rise to the top.

I've spent far more time than is reasonable reading about the computer Roshambo tournaments, and will present a distilled explanation of the strategies here.

Henny

Henny exemplifies the simplest strategy which works well in practice. It picks a random move which the opponent played previously, and plays the response which beats it. If the opponent has any bias toward playing the same move over time, Henny will win. Henny is also very defensive - optimal play against it will only get an edge very slowly over time, and even that might get swallowed by noise.

Markov Chains

Another winning in practice strategy is to use markov chains - look at the last n moves your opponent made, and predict that they'll continue playing as they have most frequently in the past. Considerable variation is possible in length of chains, which brings us to...

Iocaine Powder

The competitor Iocaine Powder magically combines the best qualities of Henny and markov chaining by looking for the longest match between the moves your opponent just played and any sequence they've played in the past, and assuming they'll play the same next move again. If the opponent has a bias based on the last n moves for an n, this strategy will essentially pick a random time they played the same last n moves they just did, which is the defensive strategy used by Henny.

Ironically, this algorithm is completely deterministic.

Note that it's slightly better when there are equal length matches to go with the first matching sequence rather than the last matching sequence, to perform well when an opponent plays a string of paper, then a scissors, then a string of paper again.

It's possible to implement this technique very efficiently using trees, which strangely Iocaine Powder doesn't do (MemBot does, though).

But this strategy still has a major weakness, and Iocaine Powder has another trick up its sleeve...

Sicilian Reasoning

A braindead, fixed sequence entry named DeBruijnize nearly cleaned up in the first tournament. DeBruijnize sequences are biased against sequences they've done in the past, rather than in favor of them. A meta-strategy called sicilian reasoning beats it nicely. Compute how well you would have done to date predicting the opponents move straightforwardly, then always shifting it up one (paper->scissors, scissors->rock, rock->paper), then down one, then the same three predicting instead your own side. Play whichever one would have the highest score so far.

Sicilian reasoning cleanly defeats not only DeBruijnize, but all manner of cheap variants on it.

Sword and Shield

This is my own idea, which is untested, but seems reasonable.

Generate predictions via whatever method (pre-sicilian reasoning) for both your move and the opponent's move, then play a move based on the following table. The top row is your predicted move, the left row is the opponent's (t = stone) -

        p  s  t 
      ---------
     |
   p |  s  p  s
     |
   s |  t  t  s
     |
   t |  t  p  p

This algorithm plays defensively by predicting both sides simultaneously, and hence may beat any opponent which bases each specific move on a prediction for only one side. It has nine sicilian reasoning variants, found by offseting the prediction for either side one of the three possible ways.

Wimping Out

A tempting strategy is to make your bot 'wimp out' and start playing randomly if it isn't doing well. Tournaments play two programs against each other many times with no persistent information between runs to keep this strategy from being effective.

Optimizing for Score

The winning entry, Greenberg, looks at historical performance if its history buffer was various sizes, and has exponential falloff of their performance. This makes it score very well against weaker opponents, but makes it more vulnerable to better opponents. Score, match wins, and worst-case lossage are all subtly different things to optimize for.

The Future

Sadly, there appear to be no plans for future computer Roshambo tournaments. It would be interesting to see how strategies continue to evolve.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!