AI Research the Games: DeepMind Beats Nearly All Comers
When DeepMind, Google’s AI research outfit, set out to demonstrate its latest breakthrough, it had to confront an added twist: how do you set your robot free to play games on the internet without anyone realising they’re competing against it?
The company caused a stir when it announced that its AlphaGo AI had beaten a world-class player at the ancient Asian board game Go. A few months later, it beat the world number one player.
But for the deeply strategic real-time war game StarCraft II, it had a different goal: to reach “grandmaster” standard – putting it in the top 200 players worldwide – on the game’s public servers, building its ranking the same way any human player would. That meant being matched with a steadily improving cadre of other human players, and winning against them consistently enough to be promoted.
StarCraft may seem like an odd next step for a team that has previously taken on chess and Go, but the game has some qualities that make it interesting to researchers. It’s real-time, with millions of possible actions each second, and a vastly more complex roster than the six piece types of chess. Most importantly, it features hidden information: for the first few minutes of the game, it’s impossible to even see what your opponent is doing, let alone work out what they’re planning.
That means strategies have to be flexible enough to account for surprises, and need to incorporate mind-games as well. There’s also an advantage in a community where even the best players in the world can be found playing each other online, ranked according to a very public algorithm, with a ton of data flying around.
Players were told the new AI, dubbed AlphaStar, would be online, and were given the option to opt in to playing it. To ensure it achieved its rank fairly, it had to play its games anonymously, so that opponents didn’t spend more effort trying to trick it or break it than they did trying to win.
“There was a bit of a meme where people started asking ‘are you AlphaStar’ to others,” said DeepMind’s David Silver, a principal research scientist at the company and a lead author on the Nature paper announcing the StarCraft II victory. “We had the policy to just not chat – other than wishing people good luck, and then ‘good game’.”
The need to remain anonymous also turned the experience from a test of raw skill into a sort of “Turing test for video games”, said Silver’s colleague Oriol Vinyals. “AlphaStar needed to play like a good human, not like a superhuman.”
That meant taking a different approach from previous StarCraft AIs, which tended to lean on the abilities that only a computer could have. In a game where human competitors track their “actions per minute”, a professional-level player may hit three or four hundred, while some AIs were acting thousands or tens of thousands of times over a sixty-second period. At other times, AIs were given near omniscience, with all the information available over the entire map plugged into their systems at once.
“We really wanted to have an interface that we believe was reasonable from a capability standpoint,” says Vinyals. “So we added this notion of a camera view, which is very crucial for players to control where in the map they’re actually focusing on, and we also reduced the peak actions per minute, to 22 actions in a span of five seconds.” In other words, the AI is forced to play much more like a human.
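A cap of this kind can be pictured as a sliding-window limit on actions. The sketch below is illustrative only: the article does not describe how DeepMind enforces the limit, and the names ActionRateLimiter, try_act and peak_actions_per_minute are hypothetical. It assumes the cap means "no more than 22 actions in any five-second span" and that timestamps are given in seconds.

```python
from collections import deque


class ActionRateLimiter:
    """Hypothetical sliding-window cap: at most `max_actions`
    in any span of `window_seconds` (defaults taken from the quote)."""

    def __init__(self, max_actions=22, window_seconds=5.0):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of recently accepted actions

    def try_act(self, now):
        """Record an action at time `now` (seconds) if the cap allows it."""
        # Forget accepted actions that have left the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_actions:
            return False  # over the cap: the action is refused
        self._timestamps.append(now)
        return True


def peak_actions_per_minute(max_actions=22, window_seconds=5.0):
    """Convert a per-window cap into the equivalent per-minute rate."""
    return max_actions * 60.0 / window_seconds


# An agent trying to act ten times a second for five seconds
# gets only 22 of its 50 attempts through.
limiter = ActionRateLimiter()
accepted = sum(limiter.try_act(t * 0.1) for t in range(50))
```

Sustained for a full minute, 22 actions every five seconds works out to 264 actions per minute, below the three or four hundred a professional-level player may hit.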
All of which is moot if the AI gives itself away by, well, playing like a robot. Luckily, it doesn’t – quite. In the first series of matches played publicly, in January, AlphaStar did exhibit one slightly mechanistic behaviour, falling prey to an almost cartoonish tactic in which its opponent, the human player MaNa, moved a unit into and out of its field of view, prompting the AI to change its behaviour each time. The trick let MaNa eke out the only win the humans scored over those first 11 matches.
More interestingly, the AI did develop its own understanding of the best tactical play, occasionally differing from the generally accepted practice among pros. The intricacies are a bit specialist, but reinforce the idea that simply teaching an AI to perform a task to human level can improve our understanding of the work itself.
“AlphaStar has been an amazing experience,” Vinyals says. “Not because we beat most humans. But it’s more like that we were able to see what some limitations might be, to inspire research that will come, hopefully in the next few months or years and decades. Picking harder and harder problems and trying to be very good at them has been clearly the way so far.”
Ian McCawley is a Contributing Editor at The National Digest based in the United Kingdom. You can reach him at firstname.lastname@example.org.