Google DeepMind's Self-Learning Computer!

Oct 31, 2017 By Christine H, Guest Writer

Can you imagine a smart computer that can figure out how to play a game by itself?

Google recently announced that the latest version of its AlphaGo program can now master the board game Go without any help from humans!

This version, called AlphaGo Zero, was given nothing but the rules of the ancient Chinese game of Go. Within three days, AlphaGo Zero became good enough to beat the previous version by 100 games to 0. We take a look at why this is a significant breakthrough.

How Difficult Is Go?

Artificial Intelligence (AI) is the science of making computers do things that normally require human intelligence (see Side Notes). In 1997, IBM’s Deep Blue chess-playing machine beat the reigning world champion, Garry Kasparov.

Programs such as Deep Blue examine the possible moves and their outcomes in order to pick the next move. This approach does not work for Go, which is far more complex than chess and has a vastly larger number of possibilities. For example, there are about 400 ways to play the first two moves of a chess game, but about 130,000 ways to play the first two moves of a Go game.
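To get a feel for those numbers, here is a back-of-the-envelope sketch in Python. The two-move counts follow directly from the rules (20 legal first moves for each chess player; 361 open points, then 360, in Go). The average branching factors of roughly 35 for chess and 250 for Go are commonly cited estimates, not figures from this article.

```python
# First two moves: each player picks one move.
chess_two_moves = 20 * 20      # 20 legal openings per side = 400
go_two_moves = 361 * 360       # 361 points, then 360 left = 129,960

print(chess_two_moves, go_two_moves)

# Why "examine everything" breaks down: the number of possible games
# grows exponentially with how many moves ahead you look.
def rough_game_count(branching_factor, moves_ahead):
    return branching_factor ** moves_ahead

print(f"chess, 10 moves ahead: ~{rough_game_count(35, 10):.1e}")
print(f"go, 10 moves ahead:    ~{rough_game_count(250, 10):.1e}")
```

Ten moves ahead, the Go count is already about a hundred million times larger than the chess count, and every extra move multiplies the gap again.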

A researcher at Google described the search space of Go as larger than the number of atoms in the universe! “Search space” is the computer science term for the number of possibilities that have to be examined or searched through. Because of this complexity, many researchers, including those at Google, saw mastering Go as a major milestone for AI.
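That comparison is easy to sanity-check with a couple of lines of Python. The figures below are standard rough estimates, not from the article: every point on a 19x19 board can be empty, black, or white, which gives 3^361 board colorings (an over-count of the legal positions), while the observable universe is usually estimated to hold about 10^80 atoms.

```python
position_upper_bound = 3 ** 361   # every point: empty, black, or white
atoms_in_universe = 10 ** 80      # common rough estimate

print(len(str(position_upper_bound)) - 1)        # ~172: a 10^172-ish number
print(position_upper_bound > atoms_in_universe)  # True, by a huge margin
```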

AlphaGo and AlphaGo Zero

AlphaGo was designed to predict the next move based on data from millions of moves made by human experts. It was then improved so that it could learn new strategies for itself. It used complex algorithms and neural networks (technology that loosely mimics the neurons in the human brain) to choose the best move.
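As a rough illustration of what "predicting the next move with a neural network" means, here is a toy sketch in Python with NumPy. This is not DeepMind's architecture (AlphaGo used deep convolutional networks trained on real games); it is just a minimal, untrained network that turns a board position into a probability for every possible move.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "policy network": board in, probability of each move out.
BOARD = 19 * 19          # 361 points on a Go board
MOVES = BOARD + 1        # every point, plus "pass"

# Random, untrained weights; training would adjust these numbers.
W1 = rng.normal(0, 0.1, (BOARD, 128))
W2 = rng.normal(0, 0.1, (128, MOVES))

def predict_move_probs(board):
    """board: vector of 361 values (+1 black, -1 white, 0 empty)."""
    hidden = np.maximum(0, board @ W1)   # a simple layer of "neurons"
    scores = hidden @ W2
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()               # softmax: scores -> probabilities

empty_board = np.zeros(BOARD)
probs = predict_move_probs(empty_board)
print(probs.argmax(), probs.sum())       # best-scored move, and 1.0
```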

AlphaGo became the first computer program to beat top human players at Go: first the European champion Fan Hui in October 2015, and most recently the world’s top-ranked player, Ke Jie of China, in May 2017. For these matches, AlphaGo still required huge amounts of data from previous games, as well as enormous computing power.

With AlphaGo Zero, the researchers at Google decided to teach the program only the rules of the game and let it discover strategies by playing against itself millions of times. This let them build a simpler program that needed less processing power. By learning from every win and loss, AlphaGo Zero steadily improved its strategies, progressing from amateur to expert level in days!
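To make "learning by playing against itself" concrete, here is a minimal self-play learner in Python for tic-tac-toe, a game small enough to fit in a table. This is not AlphaGo Zero's method (which combined a deep neural network with a search procedure); it only shows the same basic loop: play against yourself, then pull the winner's positions up and the loser's down.

```python
import random
from collections import defaultdict

values = defaultdict(float)   # how promising a position looks for the player who just moved
LEARNING_RATE = 0.2

def winner(board):
    lines = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]
    for i, j, k in lines:
        if board[i] != " " and board[i] == board[j] == board[k]:
            return board[i]
    return None

def play_one_game(explore=0.1):
    board, player, history = [" "] * 9, "X", []
    while True:
        moves = [i for i, cell in enumerate(board) if cell == " "]
        if not moves:
            return history, None                      # draw
        if random.random() < explore:
            move = random.choice(moves)               # sometimes experiment
        else:
            def value_after(m):
                trial = board[:]
                trial[m] = player
                return values[tuple(trial)]
            move = max(moves, key=value_after)        # otherwise play the best-valued move
        board[move] = player
        history.append((tuple(board), player))
        if winner(board):
            return history, player
        player = "O" if player == "X" else "X"

# Self-play training: after each game, nudge the winner's positions
# toward +1 and the loser's toward -1; draws are left unchanged.
for _ in range(20000):
    history, champ = play_one_game()
    if champ is None:
        continue
    for position, mover in history:
        target = 1.0 if mover == champ else -1.0
        values[position] += LEARNING_RATE * (target - values[position])

print("positions learned:", len(values))
```

The only feedback the program ever gets is who won, yet after thousands of games its table of position values plays noticeably better than random. AlphaGo Zero's feedback loop works on the same principle, just at an enormously larger scale.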

This is a huge milestone, and scientists hope to apply similar algorithms to other fields of science and medicine, such as designing new drugs or materials.