More than 1,000 humans fail to beat AI contender in top crossword battle

An AI system has bested nearly 1,300 human competitors in the annual American Crossword Puzzle Tournament to achieve the top score. The computer, named Dr Fill, is the brainchild of computer scientist Matt Ginsberg, who designed its software to automatically fill out crosswords using a mixture of “good old-fashioned AI” and …

  1. Neil Barnes Silver badge

    One of those US crosswords

    With lots of letters to help the matching, and lots of US local knowledge required?

    Let's see how it copes with a UK cryptic - the Times, the Telegraph, the Guardian...

    1. Chris Miller

      Re: One of those US crosswords

      Let alone the Listener (now in the Times).

      1. Giles C Silver badge

        Re: One of those US crosswords

        Here is an example instruction from the Listener crossword (it is from this week's Times):

        Two clue answers are a letter shorter than their grid entry lengths (given in brackets) and must initially be entered with an empty cell. In 24 other clues the wordplay omits one letter of the answer; these letters read in normal grid order spell two words, A and B. Solvers must find four examples of A (each 6 letters in a straight line) and replace each with one of the other three, representing the outcomes of hypothetical Bs. Finally, solvers must enter the appropriate letter in the empty cell. All final grid entries are real words or phrases.

        Now I personally can’t even make sense of the paragraph above let alone solve the crossword - I would like to see an AI do this.

        1. Yet Another Anonymous coward Silver badge

          Re: One of those US crosswords

          So you're suggesting cryptic crossword clues to replace recognising American parking meters as a captcha?

        2. Tom 7 Silver badge

          Re: One of those US crosswords

          Old Guardian cryptic: Hark! Sexual deviation, 5,2,4,4.

    2. Anonymous Coward
      Facepalm

      Re: One of those US crosswords

      Indeed. I would have been surprised if it didn't trounce the competition, especially since it trained on crossword dictionaries as well as other word sources. It approaches the banality of a computer winning a spelling bee.

      1. Eclectic Man Silver badge
        Facepalm

        Re: One of those US crosswords

        I very rarely complete the i's 'Five clue' cryptic crossword*. I did finish it yesterday (without looking at the answers first), but I've not solved any clues in today's (1st of May).

        *Yes, I mean the small one with only five clues, I've never got more than 5 answers on the proper full sized cryptic crossword in the same paper :o(

  2. Anonymous Coward

    Influencers

    “This isn’t just Photoshopping things. It’s making data look uncannily realistic,

    Oh, those self-important, jumped-up twatty shills who think they are so important that they can flex and "influence" people into paying for something they were given for free to advertise and "review"! The "influencers" will be influenced into being all over that software!

    1. tfewster
      Facepalm

      Re: Influencers

      Deepfake satellite imagery - is this really a problem? Would Google Earth or the military use images from untrusted sources? Of course, Google Earth redacts images of sensitive areas itself.

  3. a_yank_lurker

    Crossword Puzzles

    Crossword puzzles are really tests of vocabulary and interpretation of clues. While they do require some skill, they can be solved by an algorithm that uses a dictionary that cross-references clues. Often a partially filled out solution can be 'guessed' in a US crossword puzzle by the fact that there are only a couple of words that fit the remaining spaces.

    A properly programmed computer should always beat humans because it is much faster at searching data sources than a human can ever be. This really proves nothing to anyone who has worked with computers and programming.
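
The "only a couple of words that fit the remaining spaces" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Dr Fill works: the word list `WORDS` and the helper name `candidates` are made up for the example, and a real solver would load a large dictionary and also score each candidate against the clue.

```python
import re

# Tiny illustrative word list (an assumption for this sketch);
# a real solver would load a full dictionary.
WORDS = ["crane", "crate", "crave", "grate", "trace", "brace"]

def candidates(pattern, words=WORDS):
    """Return the words that fit a slot pattern.

    The pattern uses known letters plus '.' for each cell that is
    still empty, e.g. "cra.e" for a five-letter slot with one gap.
    """
    regex = re.compile(pattern)
    return [w for w in words if regex.fullmatch(w)]

# Each crossing letter that gets filled in narrows the list further.
print(candidates("cra.e"))
print(candidates(".rate"))
```

With a pattern this constrained, only a handful of words survive, which is why a partly filled US-style grid is so often guessable.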

    1. Ken Moorhouse Silver badge

      Re: they can [be] solved by an algorithm the uses a dictionary that cross-references clues

      Similarly with sudoku, a computer can easily solve it purely by going through the various permutations, as (by customary definition) there's only the one solution.

      In future, I can see that Prize Sudoku puzzles will be awarded prizes not on the solving of the puzzle itself, but on the sequence in which it was solved, making use of the dependency logic within the structure of the clues to formulate the result in any given square.

      Of course, a computer can do that too, but there the programmer of the puzzle has proven that he/she has the in-depth knowledge of sudoku to be able to solve it (and arguably is therefore worthy of the prize), whereas the brute-force coder just needs to repeatedly fill in a two-dimensional array with incrementing values, rejecting anomalies as they appear, until the final cell is filled.

      The upshot of this is that, in future, it would not be possible to submit entries for such prizes on paper, only by an app-based adjudicator that observes the puzzle being filled in. I suspect it will be just as much a challenge for the setters of these puzzles to deduce the winner by having to analyse in detail the correct steps for solution: for some puzzles there will be many valid routes to the solution.

      (Anyone else here interested in this area of software development?)

      1. a_yank_lurker

        Re: they can [be] solved by an algorithm the uses a dictionary that cross-references clues

        For a computer, a brute-force solution is possible for many puzzle games; it will be slow by computer standards but probably still much faster than any human. A more elegant algorithm that displays a deeper knowledge of the puzzle by the programmer should be faster yet. But in a competition the elegance of the computer solution is probably not critical.

        1. Ken Moorhouse Silver badge

          Re: if the competition the elegance of the computer solution is probably not critical.

          Unfortunately you may be right. Will there ever come a day when sheer computing power is restricted because it is not in the spirit of competition, or because it is detrimental to the environment?

          The other way to combat taking the computing power shortcut is to create puzzles where it is only possible to solve them using human reasoning due to their Big O complexity. Chess is/was an example, but is/has swiftly being/been eroded by computer power.

          This raises the question as to the fundamental correlation between human reasoning and its synthetic equivalent. Conquer that and it is game over for the human race as we know it. Husbandry dictated by computer, the dictator's wettest of dreams.

      2. Emir Al Weeq

        Re: they can [be] solved by an algorithm the uses a dictionary that cross-references clues

        I once wrote a Sudoku solver using nothing more than VBA in Excel. The logic was not totally brute-force, rather a process of elimination of illegal values, one cell at a time repeated over and over until solved. If it got stuck, it picked a cell with the fewest alternatives and recursively applied the process using each value.

        The machine was a fairly standard domestic box purchased in the age of Windows XP. It could solve the most fiendish puzzles I could find in less than a second.
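
The approach described above (eliminate illegal values cell by cell, and when stuck, guess in the cell with the fewest alternatives and recurse) can be sketched in Python rather than VBA. This is a reconstruction from the description only, not the original code; the names `PEERS`, `options`, `undo` and `solve` are invented for the sketch, and the sample puzzle is a widely circulated example grid.

```python
def peers_of(r, c):
    """Cells sharing a row, column, or 3x3 box with (r, c)."""
    br, bc = 3 * (r // 3), 3 * (c // 3)
    cells = {(r, j) for j in range(9)} | {(i, c) for i in range(9)}
    cells |= {(br + i, bc + j) for i in range(3) for j in range(3)}
    cells.discard((r, c))
    return cells

PEERS = {(r, c): peers_of(r, c) for r in range(9) for c in range(9)}

def options(grid, r, c):
    """Digits still legal for the empty cell (r, c)."""
    used = {grid[i][j] for i, j in PEERS[(r, c)]}
    return [d for d in range(1, 10) if d not in used]

def undo(grid, cells):
    """Empty the given cells again after a dead end."""
    for r, c in cells:
        grid[r][c] = 0

def solve(grid):
    """Solve a 9x9 grid in place (0 = empty); return True on success."""
    filled = []  # cells filled at this level, undone if we hit a dead end
    progress = True
    while progress:
        progress = False
        best = None  # empty cell with the fewest legal digits in this pass
        for r in range(9):
            for c in range(9):
                if grid[r][c]:
                    continue
                opts = options(grid, r, c)
                if not opts:  # contradiction: no digit fits here
                    undo(grid, filled)
                    return False
                if len(opts) == 1:  # forced value: write it in
                    grid[r][c] = opts[0]
                    filled.append((r, c))
                    progress = True
                elif best is None or len(opts) < len(best[1]):
                    best = ((r, c), opts)
    if best is None:
        return True  # the final pass found no empty cells
    # Stuck: try each value for the cell with the fewest alternatives.
    (r, c), opts = best
    for d in opts:
        grid[r][c] = d
        if solve(grid):
            return True
    grid[r][c] = 0
    undo(grid, filled)
    return False

PUZZLE = [
    "53..7....", "6..195...", ".98....6.",
    "8...6...3", "4..8.3..1", "7...2...6",
    ".6....28.", "...419..5", "....8..79",
]
grid = [[0 if ch == "." else int(ch) for ch in row] for row in PUZZLE]
solved = solve(grid)
for row in grid:
    print("".join(str(d) for d in row))
```

Guessing in the cell with the fewest alternatives keeps the recursion shallow, which fits the observation that even modest hardware finishes almost instantly.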

    2. JulieM

      Re: Crossword Puzzles

      There is a lot to interpreting a crossword clue, like "Social worker carries record player to a church next door" (8). You first have to work out how the word is being clued -- whether it is by reference to the meaning of the word, by reference to its pronunciation, by reference to the letters that make it up, or in some sort of composite fashion. Then you have to deal with the wordplay, which may involve obscure popular culture references. (Would you know what particular animal "Basil's Nest" might be a reference to?)

      I don't think it's a trivial task at all. If all the answers are unadulterated dictionary words, you probably could try to brute-force a set of words into the grid so all the shared letters matched; but it might well not be unique, and it's a coin toss whether you might be able to disambiguate it by making it fit one of the clues. In a really bad case, you might get a false positive match on a wrong answer that throws everything else out. And this method is not going to work for some really complex crosswords such as The Listener, where the clue does not directly indicate the answer but a word that has to be modified in a certain way before writing it into the grid.

    3. gerdesj Silver badge
      Mushroom

      Re: Crossword Puzzles

      A decent cryptic crossword puzzle requires not only deep understanding of grammar and linguistics but an almost encyclopedic knowledge of crossword puzzle "moves". Throw in a really odd dictionary that covers nearly all languages and a few that may not actually exist except out of the corner of your eye.

      These moves (I've made it up) are a bit like chess gambits. You collect a few and get a fairly deep understanding of them but then a new twist turns up. Bugger. Start again. Then some arse of a setter comes up with something new and you are properly screwed for a while.

      I can't solve a good cryptic puzzle without a lot of fuss (and even then probably not) but I do appreciate the skills involved on both the setter's and the solver's sides. My mum tried to explain some of the methods to me but I'm no good at that sort of thing.

      I suspect the puzzles solved here are effectively a matter of finding words that fit. A bit like a space-filling problem with letters. That is still a great result and non-trivial, but these are not "real" crosswords.

      If Hey Goog/Alexa/Siri etc can't manage your lights without turning your house into a deranged disco or a land of obstinate darkness then they will not be turning to the back page and solving a cryptic crossword puzzle.

  4. Ken Moorhouse Silver badge

    Reference the stock image for this article

    What kind of wus uses a pencil to fill in a crossword?

    1. gerdesj Silver badge
      Happy

      Re: Reference the stock image for this article

      "What kind of wus uses a pencil to fill in a crossword?"

      "Pathetic person easily scared by ghostly noise, forgot the second s, perhaps" (3)

      8)

  5. jmch Silver badge

    Unfair comparison

    Crossword AI: 2 CPUs + GPU - approx 200W (and probably trained on a rig consuming Megawatts)

    Human brain - approx 20W

    1. Charlie Clark Silver badge
      Coat

      Re: Unfair comparison

      > and probably trained on a rig consuming Megawatts

      In which case you should also consider the energy used to train the meatware…

      But the comparison is irrelevant. What is interesting, and why the market is so hyped, is that these kinds of puzzles were often the target of the first AI systems in the 1960s, and they soon proved too complicated for those systems. Some business processes map quite well to these kinds of challenges, which is why they're starting to be automated.

      Of course, some AI, a bit like nuclear fusion, will always remain ten years away…

      1. Tom 7 Silver badge

        Re: Unfair comparison

        And yet, strangely, AI seems to be providing the greatest step forwards in Nuclear Fusion by controlling the plasma for longer periods than ever before!

  6. JulieM

    How does a computer solve crosswords?

    How do you even begin to get a computer to solve a crossword? I mean, something like "Johnny rents out a flat in Paris! (6, 6)" or "Trains, whether in whole or in part" (7) is hard enough for a human. And those are just entry-level, double definition clues. Just wait till you get onto sound-alikes and awkward letter-by-letter constructions.

    I suppose you could just about get used to a particular compiler's preferred style of wordplay, and pick up on common themes they used -- maybe they always or very often use a particular phrase to indicate a particular construction.

    Or you could cheat; ignore the clues altogether to begin with, and just search for dictionary words that fit with each other in the grid. Then you might be able to reconcile one or more of the more obvious clues closely with one of the possible answers.

    But deciphering crossword clues always struck me as being a very human thing that just would not be at all easy for a machine. It's effectively reverse-engineering extreme poetry.

    Still, all kudos to anyone who has managed to pull it off .....
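
The "cheat" of ignoring the clues and searching for dictionary words that fit each other in the grid is a small backtracking search. Below is a minimal sketch under stated assumptions: the function name `fill`, the slot layout and the four-word list are all invented for illustration, and nothing here is taken from Dr Fill.

```python
def fill(slots, words, letters=None, used=frozenset()):
    """Assign a distinct word to every slot so that crossing cells agree.

    slots: list of slots, each a list of (row, col) cells in reading order.
    Returns a dict mapping cell -> letter, or None if no fill exists.
    """
    letters = {} if letters is None else letters
    if not slots:
        return letters
    slot, rest = slots[0], slots[1:]
    for word in words:
        if word in used or len(word) != len(slot):
            continue
        # Reject the word if it clashes with a letter already in the grid.
        if any(letters.get(cell, ch) != ch for cell, ch in zip(slot, word)):
            continue
        trial = dict(letters)
        trial.update(zip(slot, word))
        result = fill(rest, words, trial, used | {word})
        if result is not None:
            return result
    return None  # dead end: the caller tries its next word

# A ring of four three-letter slots around the edge of a 3x3 grid.
SLOTS = [
    [(0, 0), (0, 1), (0, 2)],  # top row, across
    [(0, 0), (1, 0), (2, 0)],  # left column, down
    [(0, 2), (1, 2), (2, 2)],  # right column, down
    [(2, 0), (2, 1), (2, 2)],  # bottom row, across
]
WORDS = ["cat", "cot", "tap", "top"]

solution = fill(SLOTS, WORDS)
for slot in SLOTS:
    print("".join(solution[cell] for cell in slot))
```

The search returns the first consistent fill it finds. Even this toy grid has more than one valid fill, so the clues are still needed to choose between them.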

    1. William Towle
      Thumb Up

      Re: How does a computer solve crosswords?

      > deciphering crossword clues always struck me as being a very human thing that just would not be at all easy for a machine. It's effectively reverse-engineering extreme poetry.

      At University I decided the opposite - for a human, the first obstacle is ignoring what looks like straightforward plain English and then what remains involves interpreting the alternative grammar of what's there (I once told a dyslexic friend with a newspaper over lunch that "it's not English. The majority of clues are like equations, with two halves producing matching answers; some words are operators, and others are operands - some used verbatim and some like variables", and after a little further elaboration it clicked for him and we polished off the whole thing together). That latter interpretation stage struck me as very much what a classic expert system does.

      I agree that the possibility of soundalike words and so on adds complexity to the parsing side of the problem, but that's just peanuts to processor grunt ... if you can seed the dictionary and grammar engine in the first place - and I suppose that's where the modern AI/ML/big data side of it comes in. 20 plus years ago I wouldn't have liked to train a system by hand - there's a reason my final year project was something else!

    2. Martin
      Unhappy

      Re: How does a computer solve crosswords?

      OK - I'm not an expert by any means - I can normally get about half of the Observer Everyman crossword.

      But you called them entry-level, double definition clues. So I thought I'd have a chance. But I've stared at those clues and I can't make any headway.

      Johnny rents out a flat in Paris! (6, 6)

      I can see that's an anagram on "Johnny Rents" and the clue is "a flat in Paris" - or just possibly the other way round - but I can't get it.

      Trains, whether in whole or in part (7)

      Trains as in teaches, or possibly as in engines, but the other bit? Don't know.

      Could you please give us a clue to the clues?

      I agree - I suspect something like the Times Crossword is going to be one of the last holdouts for computer solutions.

      1. JulieM

        Re: How does a computer solve crosswords?

        > Could you please give us a clue to the clues?

        I thought I had! ;) Anyway, this is as close as I can take you to the actual answers without a blatant spoiler, so you can still get to feel the light bulb coming on.

        > Johnny rents out a flat in Paris! (6, 6)

        > I can see that's an anagram on "Johnny Rents" and the clue is "a flat in Paris" - or just possibly the other way round - but I can't get it.

        That's because you are partitioning the clue wrongly. The first definition is just "Johnny" and the second is "Rents out a flat in Paris". "Parisian landlord" is too long by itself (and why would he be called Johnny specifically?) Remember English has plenty of prefix-suffix combinations that would be perfectly sensible constructions, but are not used -- or at least are not used in all senses -- in normal conversation. For instance, "drawer" always refers to a sliding storage compartment in a piece of furniture, and never -- at least, not outside the mind of a crossword setter -- say, an artist, or an oscilloscope (although neither sense is strictly invalid except by habit, and would certainly be understood from the mouth of a young child or a non-native speaker). This is one possible example of somewhere a computer might actually have a slight advantage over a human, if it has not already assigned too low a probability score to some search paths. What does a landlord do? They let out property. And Paris is the capital of France. Now go searching depth-first, literally -- start in the gutter .....

        > Trains, whether in whole or in part (7)

        > Trains as in teaches, or possibly as in engines, but the other bit? Don't know.

        You are on the right lines (pun definitely intended) with two meanings for "trains". Think "trains in whole" = gives instruction, and "trains in part" = parts of (railway) trains. "Carriages" is too long, "cars" or "wagons" is too short, and anyway neither of those can be reconciled with the first part of the clue .....

        1. Martin
          Happy

          Re: How does a computer solve crosswords?

          OK - the light has gone on for both of them! Thanks.

          The first one is clever. Though I don't feel too upset I didn't get it - I think that an anagram on "Johnny rents" (or "a flat in Paris") could easily be indicated by the "out", and the fact that both "Johnny rents" and "a flat in Paris" have twelve letters led me seriously astray.

          The second one I really should have got. I admit I cheated - I typed the clue into Google and found "Train, or part of one (5)" - and that's basically the same clue.

          Thanks very much for your help. Just confirms to me that I'm not really a crossword person. I'll stick to Sudoku - I'm good at those!

    3. Tom 7 Silver badge

      Re: How does a computer solve crosswords?

      My mum used to muller the Telegraph crossword in less than ten minutes. As a science geek who gave up on word shit quickly, I never really bothered with that sort of thing. Then one night, after we'd had a few drinks when I was back from uni, dad brought the paper back from work and I sat next to her and asked what she was thinking as she did it. It seemed the compiler left very easy-to-spot clues, which really stood out if you didn't really read the words. Once you'd spotted those, that merely left the anagrams, which are really tricky after half a bottle of Talisker. I can't remember the name of the compiler, but I found his crosswords quite easy after that whenever I found a copy at reception at job interviews.
