Rock, Paper, Azure: The Softwire Bot
25 August 2011, by Chris Harris
Microsoft recently ran a coding competition called Rock Paper Azure to publicise their new cloud platform. The challenge was to produce a bot able to win at a slightly modified game of Rock Paper Scissors (you also had a limited number of “Dynamites” to play, which beat everything, and as many “Water Balloons” as you liked, which only beat Dynamite).
A couple of Softwire employees wrote bots in their spare time, and one of us – Matthew Richards – ended up in the prize-money positions. In fact, he was the highest-placed Brit, and finished fifth overall!
I asked Matthew how he managed it.
Congratulations on your 5th-place finish!
Thanks! It was a bit of a surprise to be honest – it felt like I’d been heading steadily down the leader board for the final week or so of the tournament. But Rock Paper Scissors always involves a certain element of luck (even the modified version involved in this competition) – so I suppose either I got luck at the last minute, or had been getting unlucky for the previous few days!
How did you get involved in Rock Paper Azure?
One of my Softwire colleagues mentioned it on our internal blog. There was a suggestion we could get together one weekend and spend a day working on bots together – the big coding event didn’t happen in the end, but having several other people to talk to and discuss ideas with was really useful.
Take us through the game theory aspect of the game – i.e. if you were playing as a human, what strategy would you employ, and why?
Ignoring the Dynamite and Water Balloon, the theory goes that you need to play randomly to be unbeatable (on average). As a human, that’s probably the most challenging aspect. In fact we did a practical on this in the pub at one point, followed up by some statistical analysis that I didn’t try to follow too closely – the conclusion was that someone who thought they were being pretty random, wasn’t!
Introducing the new moves, the first question to answer is when should you use your dynamite? It’s a limited resource, so the obvious answer is “when it’s particularly valuable to do so”. In Rock Paper Azure, points get accumulated if there’s a tie so the time to roll out the dynamite is when there have been a few draws in a row and there are lots of points to play for.
Except that then your opponent will guess what you’re about to do, and will play Water Balloon. Which is the other key challenge – predicting when your opponent will use the Dynamite.
As a human, I would try and fail to analyse my opponent’s facial tics, and do very badly…
Apart from not being tempted to psycho-analyse its opponents, what did your bot do that you couldn’t do yourself?
In summary – calculating the odds, and remembering what’s already happened. Two things that I could do manually, but taking half an hour and scribbling stuff on a pad of paper probably wouldn’t be deemed acceptable.
My algorithm basically boiled down into three components. Firstly, some logic to decide when we’d like to use Dynamite. This is based on the principle I outlined before – the more ties there have been, the more valuable the dynamite. I basically calculated how many ties, double ties, triple ties etc there were likely to be in the remainder of the game, and compared that to the number of sticks of dynamite I had left to decide whether the current move was valuable enough to use my dynamite on.
Also, to avoid being too predictable, I only used dynamite about a third of the time that I thought it was worth it. I think the 1/3rd figure is quite important, because it maximises my own opportunity to win points, while making it expensive for my opponent to water balloon me on the off-chance that I use the dynamite. In the end I didn’t use dynamite more often if there were more points at stake – which probably confused the opposition a bit!
The second step was predicting what my opponent would do. I kept a history of “number of ties => corresponding move”, and various other combinations of historical factors, which I weighted to predict the next move. And then based on that prediction, went for an appropriate counter-move designed to maximise my expected score. This meant that some of the time I would end up not using dynamite even if I wanted to, but the first part of the algorithm was sufficiently self-correcting to ensure I didn’t end up wasting all my dynamite as a result. Obviously this wouldn’t beat a bot that behaved randomly, but unlike “pure” Rock Paper Scissors you can’t just behave randomly and win.
Finally I ended up running a number of different variations on the above algorithm in parallel, picking the one that would have achieved the highest score over the last few hundred turns. This is a level of complexity I certainly couldn’t have achieved were I playing as a human, even if I had pen and paper available! But I think it bumped me up a few places in the rankings, by enlarging the number of different bots I could beat.
Did you employ any nifty coding techniques to implement your strategy?
I used C# and made a certain amount of use of generics and LINQ. It was a pleasant change from my last attempt at a similar competition (last year’s Google AI Challenge), where I was restricted to .NET 2.
The main thing I learnt wasn’t a language feature though, but Visual Studio 2010’s much improved performance profiling tools. Running several copies of my bot in parallel proved rather slow at first, and ran up against the competition’s time limit. However running the performance profiler over my code rapidly identified some quick wins – half an hour’s refactoring improved my code’s performance fourfold! (And I don’t think it’s just that I’d implemented it unusually poorly in the first place…)
Did you watch the Grand Final live? Was it fun/interesting/dramatic?
I didn’t – it was past my bed time! I did fast-forward through the recording afterwards, and on balance I think it’s a good thing I didn’t stay up. No disrespect to those running the webcast, but there’s a limit to how much you can say about a web page with a list of bot names on it! Even if you refresh it a couple of times as the finals progress…
What will you spend the prize money on?
I think I need to take my wife out to dinner, to make up for all the time I spent writing my bot!