How DeepMind's AlphaGo Zero learned all by itself to trash world champ AI AlphaGo

DeepMind published a paper today describing AlphaGo Zero – a leaner and meaner version of AlphaGo, the artificially intelligent program that crushed professional Go players. Go was considered a difficult game for computers to master because, besides being complex, the number of possible moves – more than chess at 10^170 – is …

  1. Rebel Science

    Very impressive indeed. But I would be orders of magnitude more impressed if it could also walk to the kitchen and make me a cup of coffee.

    1. pleb

      And that is the sad truth. We humans beat machines/AI hands down when it comes to menial 'unskilled' tasks like doing the ironing, folding and putting away clothes, dusting the mantelpiece, etc. Are those the jobs we will be left with once the machines take over all the skilled jobs? Who serves whom?

      1. jmch Silver badge

        @pleb.... and curiously, does that mean that what is known as 'skilled labour' is actually not that skilled, while what is known as 'unskilled labour' actually requires highly developed skill?

        Think about it... it takes about 4-5 years of practicing 12 hours a day, 365 days a year, to be able to do speech communication at a reasonably advanced level and develop the coordination required to tie a shoelace. That's well over 20,000 hours at the time when the brain is optimally receptive to learn stuff.

        A university degree requires 4 years of maybe 40 hours a week* for maybe 40 weeks – that's 6,400 hours, during the time that the brain is dealing with the consequences of the body discovering alcohol.

        *approximate average of the 10 hours a week studying and the rest of the time at the bar during most of the semester, combined with the 100 hours a week in the couple of weeks before exams

    2. jmch Silver badge

      "would be orders of magnitude more impressed if it could also walk to the kitchen and make me a cup of coffee"

      Me too. But there's a good reason this hasn't happened yet, besides it being bloody difficult: it's cheap to hire a tea lady.

      1. Nick L

        Reminds me of a conversation not that long ago with a building society when we were trying to modernise their mortgage origination (application and opening) processes. We showed how wonderful whooshing the data around would be, and the customer nodded and asked, "is it better than Maureen?"

        "What's a Maureen?" we ask.

        Turns out Maureen retired a couple of years ago but comes in for a couple of hours each day to print, check, take action on and do all the processing needed for mortgage applications. She basically kept the whole place going, and was costing them less than £10,000 a year... She even would text updates on progress to customers if they asked for it. When she didn't come in it took a couple of people almost all day to do the same, which they admitted might be a concern as time went on.

        We asked them to let us know when Maureen finally stopped working or if they wanted to increase volumes (which they didn't). I suspect she's still there :)

  2. The Nazz

    From a naive sceptic ....

    "Our algorithms for reinforcement learning"

    Isn't that the same as saying "our programming for reinforcement learning"?

    On the matter of 10^170, can someone quote a more realistic figure? From what little of the game I know, the vast majority – 90%+ – of possible opening moves are never used, even by absolute beginners.

    That must reduce the number quoted by a large margin, maybe enough for the atoms in the universe to take the lead again.
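    For what it's worth, the 10^170 figure counts board positions, not opening choices, so unused openings barely dent it. A quick sanity check in plain Python (the naive upper bound: each of the 19x19 = 361 points is empty, black or white):

    ```python
    # Naive upper bound on the number of Go board configurations:
    # every one of the 361 points is empty, black or white.
    upper_bound = 3 ** 361
    print(len(str(upper_bound)))  # prints 173, i.e. roughly 10^172
    ```

    The count of strictly *legal* positions is lower, around 10^170. Even writing off 90% of opening moves only trims a digit or so per ply, so the atoms of the universe (around 10^80) never get close to retaking the lead.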

    1. Pascal Monett Silver badge

      I'm guessing the atoms don't mind all that much.

  5. Anonymous Coward

    So what has actually been 'learnt'? From reading the description, the program is basically determining probabilities and selecting those most favourable to a win. Does the program actually 'know' it is playing Go? If it doesn't, then isn't this just clever programming and not AI?

    1. Palpy

      Mmmm. I think what has been learnt --

      -- is that the learning algorithm makes a huge difference. The Monte Carlo tree with value and policy networks in the first AlphaGo was trained on human examples. Its learning algorithm turned out to be quite inferior to the AlphaGoZero's more open-ended self-play algorithm.

      A good programmer (ie, someone unlike myself) should be able to take a solid lesson from that.

      If one could simulate an industrial process well enough -- an oil refinery, say -- and include things like 99.9% of possible equipment malfunctions, then perhaps the simulation could be used by a self-learning algorithm to design more efficient processes and controls on the processes.

      I tend to think that human supervision will still need to be there for the 0.1% of cases which were not expected in the simulation, though.

      1. DropBear

        Re: Mmmm. I think what has been learnt --

        "The Monte Carlo tree with value and policy networks in the first AlphaGo was trained on human examples. Its learning algorithm turned out to be quite inferior to the AlphaGoZero's more open-ended self-play algorithm."

        Actually, the article - while not being factually incorrect - is incredibly misleading by strongly suggesting exactly what you say - which is, however, incorrect. BOTH machines learned by self-play; the significant difference is that AGZero learned ONLY by self-play straight from scratch, while AG was initially trained by human play up to a certain level: "The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play". Bad, bad hack!
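        For the curious, "learning purely by self-play" can be shown at toy scale. The sketch below is hypothetical and nothing like DeepMind's actual networks: a tabular value function for one-pile Nim (take 1-3 stones, whoever takes the last stone wins), trained with no human games and no built-in strategy:

        ```python
        import random

        def self_play_train(n=10, episodes=20000, alpha=0.1, eps=0.2, seed=1):
            """Learn V[s], the value of Nim state s for the player to move,
            purely from games the learner plays against itself."""
            rng = random.Random(seed)
            V = {s: 0.0 for s in range(n + 1)}
            V[0] = -1.0  # no stones left: the player to move has already lost
            for _ in range(episodes):
                s = n
                while s > 0:
                    moves = [m for m in (1, 2, 3) if m <= s]
                    # Target = best reachable outcome, negated because the
                    # opponent (i.e. the same learner) moves next.
                    V[s] += alpha * (max(-V[s - m] for m in moves) - V[s])
                    if rng.random() < eps:      # explore occasionally...
                        m = rng.choice(moves)
                    else:                       # ...otherwise play greedily
                        m = max(moves, key=lambda k: -V[s - k])
                    s -= m
            return V

        V = self_play_train()
        ```

        After training, V rates every state where the remaining stones are a multiple of 4 as lost for the player to move, which is the known optimal theory of this game – found from nothing but the rules, which is the (miniature) point of the Zero approach.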

    2. Pascal Monett Silver badge

      Re: "Does the program actually 'know' it is playing Go"

      Obviously not, otherwise we would have AI now.

      Computers and programs know nothing, they just execute instructions and flip bits. It's the programmer that knows what he is to code for, and writes the instructions that will achieve the end result.

  4. Anonymous Coward

    FFS, it isn't AI ...

    It's an expert system - one that happens to play Go rather well but is bloody useless at pretty much anything else.

    There is nothing particularly 'new' about expert systems - they've been around for donkey's years.

    Come back when it can deduce the existence of income tax and rice pudding before calling it an AI.

    1. Rebel Science

      Re: FFS, it isn't AI ...

      @Simon Ward

      You're absolutely correct. In spite of all the hype, denials and posturing, they're still doing GOFAI, the baby boomer AI of the last century. They just got faster machines and more memory to play with.

    2. Anonymous Coward

      Re: FFS, it isn't AI ...

      Yes, it is the usual story hyping something up as a big advance when it isn't. It still had to be programmed with the rules of Go, and probably (though it isn't clear from the article) some way of "valuing" positions as stronger or weaker.

      If they can make an AI that is able to read the rules of a game it has never seen before, understand those rules well enough to play against itself to learn, and then beat a human player with equivalent experience to its training (i.e. played the same number of games) then I'll be impressed.

      Until it achieves that, it is nothing that couldn't have been done back in the 70s or 80s if they had access to millions of times more computing power and memory back then.

      1. JLV

        Re: FFS, it isn't AI ...

        whoa. what's all the negging? no one is claiming strong AI / self aware/ general purpose here

        yet, even given the very real limits to this type of research, how long did it take to beat a human chess master? all that "we were doing this in the 60s..." didn't happen, did it? yeah, they thought it would be easy to beat humans, but it wasn't.

        then finally IBM did it, with years of effort. Go was next, challenging due to its nature. took a while but nowhere as long. still teams of experts assisting.

        now, quite soon after, the newly reigning Go AI is defeated by an auto-learning system using far fewer resources.

        impressive. true, means nada in terms of general AI, but an impressive addition to what was a chosen subfield of endeavour for AI research.

        HAL or equivalent? Doesn't seem like much is happening on that front, but idiot savants are starting to happen, slowly. at least in the field of boardgames.

        a field that has flummoxed many in the past.

        You're welcome to make your advances in the field if you're so dismissive of other folks'.

        1. Anonymous Coward

          They knew how to beat chess grandmasters from the very first chess program that did a tree search to rate different moves, they just lacked computing power at that point. The first computer to beat a human in a tournament (not grandmaster level I'm sure, but still if a person is going to enter a tournament you figure they are halfway decent) was in the late 60s.

          If they had the computing power Google is throwing at Go available to them in the late 60s, that chess-playing computer probably could have beaten grandmasters. The improvements they made to chess programs since then - aside from the massive increase in computing power available to them - consisted of various improvements to the tree search to prune unproductive paths and do better position evaluation in the endgame. If we had to run it on a 1960s-era computer, even with modern techniques a chess program wouldn't be all that much better, and Go would still look impossible.

          We haven't got any closer to real AI during all that time, AI researchers have just got better at marketing their work to a credulous public who thinks beating humans in chess or Go gets us closer to that goal. It doesn't, because expert human players don't play those games by evaluating trillions of moves and choosing the best one. We still don't have a clue HOW they do it, in fact.
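          The pruning mentioned above is easy to demonstrate at toy scale. A hypothetical sketch (one-pile take-1-to-3 Nim standing in for chess): negamax with and without alpha-beta cutoffs returns the same game value while visiting far fewer positions:

          ```python
          def negamax(s, counter):
              """Exhaustive search of 1-3-take Nim: +1 if the mover wins."""
              counter[0] += 1
              if s == 0:
                  return -1  # no stones left: the player to move has lost
              return max(-negamax(s - m, counter) for m in (1, 2, 3) if m <= s)

          def alphabeta(s, alpha, beta, counter):
              """Same search, but abandons lines neither side would allow."""
              counter[0] += 1
              if s == 0:
                  return -1
              best = -2
              for m in (1, 2, 3):
                  if m > s:
                      break
                  best = max(best, -alphabeta(s - m, -beta, -alpha, counter))
                  alpha = max(alpha, best)
                  if alpha >= beta:
                      break  # cutoff: remaining moves can't change the result
              return best

          full, pruned = [0], [0]
          v1 = negamax(12, full)
          v2 = alphabeta(12, -2, 2, pruned)
          # Same value from both searches; the pruned one visits fewer nodes.
          ```

          The improvements are all in how much of the tree you can avoid looking at - the evaluation at the bottom hasn't changed at all.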

      2. jmch Silver badge

        Re: FFS, it isn't AI ...

        "read the rules of a game it has never seen before, understand those rules well enough to play against itself to learn, and then beat a human player with equivalent experience to its training "

        I think that simply "read the rules of a game it has never seen before (and) understand those rules well enough to play against (anyone)" would already meet the definition of a general-purpose AI, which is a few orders of magnitude beyond current capability.

    3. CrazyOldCatMan Silver badge

      Re: FFS, it isn't AI ...

      It's an expert system - one that happens to play Go rather well but is bloody useless at pretty much anything else.

      Indeed. I was wittering on to Mrs COCM this morning after someone on the radio equated Siri/OK Google/Alexa with AI.

      Bain't AI, it be an expert system (not that I could remember the phrase "expert system" at a time before I'd had my first cup of tea and the painkillers had kicked in).

      AI would know what I wanted *before* I wanted it..

      Me: Computer, pour me a dr..

      Obedient AI slave: Yes master. It's by your left hand. I deduced that you wanted a large glass of Rex Mundi in your favourite hand-blown Georgian glass. And, by the way, you only have two cases of it left, so I've ordered some more.

      Me: Hic!

      (At which point, the cats discover that OAIS has mastered cat-language and can not only order their favourite cat-food, but has developed a non-sessile bot to open it for them and give appropriate belly-rubs. And doesn't moan at them about opening the patio door whenever they desire it. Following which, their human servant becomes obsolete and is incorporated into the very latest range of cat food, "Soylent Tuna".)

  5. Qwertilot

    Prior knowledge

    It's one thing these AIs beating us, but to demonstrate that the knowledge we've carefully accumulated over centuries of quite serious work is actively harmful?

    A tiny bit rude :)

  6. raving angry loony

    Next game is obvious...

    "Would you like to play a game?"

    "How about global thermonuclear war."

  7. Anonymous Coward

    Surely the most effective algorithm ...

    ... is to pay the machine's opponent to lose.

    1. DropBear

      Re: Surely the most effective algorithm ...

      When a machine comes up with this strategy entirely on its own I'll be needing my brown pants please...

  8. Anonymous Coward

    It's working within a bounded set of precise rules. Humans don't, except in board games and other trivial examples.

    What happens if the rules it's given are either ambiguous or unbounded. Can it discover and refine new rules? That's what (some) humans have done over the centuries, considered evidence and come up with rules.

    I'll be more impressed if it can be given C16 observations and understanding, and come up with the sun at the center and then with Newton's laws of motion.

  9. 2cent

    Variations on theme

    Writing autonomous driving software for all scenarios is extremely difficult. This type of AI could assist in localization. Unless all driving rules are standardized, the way you drive from country to country is quite different. E.g. USA vs Great Britain vs China.

  10. Tikimon

    Don't compare apples and oranges!

    We've spent ages trying to program machines to do Every Little Thing. If This, Then That for every possibility. And of course that has largely failed. However, it's been a failure of human understanding, not of the machines. We have assumed a given goal and path to get there, and expect the machines to emulate HUMAN methods. That's illogical.

    As humans, we are not packed full of instructions. We learn by trial and error and observation, with no clear goals or routes to get there. When you learned to walk, you had no idea that was what you were doing but managed it anyway. We are finally letting the machines figure it out for themselves, and - most important - in their OWN WAY. I reference the recent occasion where two machines abandoned the human communication imposed on them and invented their own.

    Can they read the rules and go play? OF COURSE NOT YET but try again in a few years. The machines are being allowed to find their own ways to accomplish tasks, and that's the breakthrough. I've made the distinction before between "Artificial Intelligence" and "Artificial Human" and this is an excellent example.

  11. snoggs

    Size doesn't matter

    The article claims that go is hard because it has 10^170 "possible moves." Even taking this to mean "possible positions," it is still nonsense. It is easy to devise games with as many possibilities as you like and a trivial winning strategy. What makes go hard is its long time horizon. Unlike chess, where for the most part each move has an immediate impact, the consequences of moves in go may not become apparent until much later in the game. The larger branching factor also comes into it, but mostly it is the difficulty of evaluating the position.
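    The branching-factor difference can at least be put in rough numbers. A back-of-the-envelope comparison using the conventional folklore figures (about 35 moves over about 80 plies for chess, about 250 over about 150 for Go - estimates, not exact counts):

    ```python
    import math

    def tree_exponent(branching, plies):
        """Base-10 exponent of branching ** plies (size of the game tree)."""
        return plies * math.log10(branching)

    chess = tree_exponent(35, 80)    # about 10^123 lines of play
    go = tree_exponent(250, 150)     # about 10^360
    ```

    Which rather supports the point above: the gulf is enormous either way, and the real difficulty is that a half-finished Go position is so hard to evaluate.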

  12. ExampleOne

    Can it win at Diplomacy?

  13. Daggerchild Silver badge

    Go mining

    Has anyone seen any breakdown of the Go patterns it rediscovered, perhaps with some translation from a Go practitioner?

    In particular, what patterns did it consider important, that the humans don't, yet?

  14. 's water music

    Can the new lower powered alphago zero.. crysis?

    ...respond to a trademark lawsuit from The Coca-Cola Company?

  15. henriquesilva

    Alpha Go Zero - Book

    This step was enormous for mankind. Maybe we cannot fully understand the implications for our lives. On Amazon there is a small book for beginners called something like Alpha Go Zero: The 10 Prophecies for Tomorrow. An interesting point of view, written by a military Go player...

  16. Calimero

    Yes, GO was invented to be played by computers

    What is the big deal [of computers beating a human at Go]? Is this the purpose of Go [to test humans' brain power versus computers' ability to create all - or most of - the combinations]?

    Next, Facebook will be simulated within Facebook to learn who spends most time on Facebook. Then within that simulation there will be another one, and so on --- all while the oil price goes up b/c so much energy goes into playing Go, and into simulating [well, trying to predict the percentage of] users who will volunteer their nude pics to Facebook. Sorry, but I cannot do that - says Dave, for a change!
