
Google's DeepMind AI gets a few new tricks to learn faster

It can now achieve 87 percent of "expert human performance" in one game.


When it comes to machine learning, every performance gain is worth a bit of celebration. That's particularly true for Google's DeepMind division, which has already proven itself by beating a Go world champion, mimicking human speech and cutting Google's server power bills. Now, the team has unveiled new methods that speed up reinforcement learning, the process by which the AI platform trains itself through trial and error rather than being directly taught.

First off, DeepMind's learning agent gets an auxiliary task: learning to control the pixels on the screen. Google notes it's "similar to how a baby might learn to control their hands by moving them and observing the movements." By doing this, the agent figures out which of its actions change what it sees, which helps it find high-scoring strategies and play games more efficiently. Additionally, the agent now learns to predict upcoming rewards by replaying its past experience, with extra emphasis on the moments when it actually scored. "By learning on rewarding histories much more frequently, the agent can discover visual features predictive of reward much faster," Google says. The company laid out the full approach in a paper, "Reinforcement Learning with Unsupervised Auxiliary Tasks."
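To make that "rewarding histories" idea more concrete, here's a minimal, illustrative sketch of a replay buffer that serves up rewarding experience far more often than it naturally occurs. The class name, the 50/50 sampling split and every numerical value are assumptions made for this example, not details taken from DeepMind's paper or code.

```python
import random
from collections import deque


class SkewedReplayBuffer:
    """Replay buffer that oversamples rewarding experience.

    Illustrative sketch only: names and the 50/50 split are assumptions
    for this example, not DeepMind's published implementation.
    """

    def __init__(self, capacity=10_000):
        self.rewarding = deque(maxlen=capacity)      # transitions with reward != 0
        self.non_rewarding = deque(maxlen=capacity)  # transitions with reward == 0

    def add(self, observation, action, reward, next_observation):
        transition = (observation, action, reward, next_observation)
        if reward != 0:
            self.rewarding.append(transition)
        else:
            self.non_rewarding.append(transition)

    def sample(self, batch_size=32):
        """Draw roughly half the batch from rewarding histories, so a
        reward-prediction task sees them far more often than they occur
        in the raw stream of experience."""
        batch = []
        for _ in range(batch_size):
            use_rewarding = bool(self.rewarding) and random.random() < 0.5
            pool = self.rewarding if use_rewarding else self.non_rewarding
            if pool:
                batch.append(random.choice(pool))
        return batch


if __name__ == "__main__":
    buffer = SkewedReplayBuffer()
    # Fake experience: rewards are rare (about 2 percent of steps).
    for step in range(1000):
        reward = 1.0 if random.random() < 0.02 else 0.0
        buffer.add(observation=step, action=0, reward=reward, next_observation=step + 1)

    batch = buffer.sample(batch_size=64)
    rewarding_fraction = sum(1 for t in batch if t[2] != 0) / len(batch)
    print(f"Fraction of rewarding transitions in batch: {rewarding_fraction:.2f}")
```

The skew matters because rewards in most games are sparse: a uniformly sampled batch would contain almost no scoring moments, so the agent would rarely get to practice spotting the visual cues that precede a reward.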

These skills, along with DeepMind's previous deep reinforcement learning methods, make up the group's new UNREAL (UNsupervised REinforcement and Auxiliary Learning) agent. That's a mouthful, but the big takeaway is that DeepMind's software is starting to teach itself in a way that looks a bit more like how humans and animals learn. The group likens the replaying of past experience to the way animals are thought to dream about positively and negatively rewarding events (though I wouldn't really say DeepMind has learned how to "dream").
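As a rough picture of how the pieces fit together, the sketch below folds a base reinforcement learning loss and the two auxiliary losses described above into a single training objective. The function name and the weighting values are placeholders chosen for illustration, not figures from the paper; the point is only that the auxiliary tasks add extra training signal alongside the main objective rather than replacing it.

```python
def combined_objective(base_rl_loss: float,
                       pixel_control_loss: float,
                       reward_prediction_loss: float,
                       pixel_control_weight: float = 0.05,
                       reward_prediction_weight: float = 1.0) -> float:
    """Weighted sum of the base RL loss and the auxiliary-task losses.

    The weights here are illustrative placeholders, not DeepMind's values.
    """
    return (base_rl_loss
            + pixel_control_weight * pixel_control_loss
            + reward_prediction_weight * reward_prediction_loss)


# Toy usage with made-up loss values from one hypothetical update step.
print(combined_objective(base_rl_loss=1.2,
                         pixel_control_loss=0.4,
                         reward_prediction_loss=0.7))
```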

In a 3D maze environment called Labyrinth, Google says the UNREAL agent was able to learn stages around ten times faster than its previous best agents. It managed to reach 87 percent of "expert human performance" in that game, and around nine times typical human performance across a bevy of Atari titles.

On the face of it, UNREAL should help DeepMind's agents significantly. But we'll have to wait and see if those performance gains can actually be used in scenarios beyond games.