DeepMind uses matrix math to automate discovery of better matrix math techniques

Google-owned DeepMind has applied reinforcement learning techniques to the multiplication of mathematical matrices, beating some human-devised algorithms that have stood for 50 years and working toward improvements in computer science. Founded in London in 2010, DeepMind became famous for beating the world champion at the board game Go …

  1. gerryg

    maths - with an S

    The snarky valedictory about AlphaFold seems a bit unnecessary. Since when has more information been less useful than less information? The recently published paper by MIT seems to suggest that it's a work in progress.

    Moreover either matrix multiplication has been improved or it hasn't. That seems a bit easier to test than new drug discovery.

    In some ways it's a little refreshing to discover that reality is a bit more challenging than can be solved by bucketloads of AI.

    1. LionelB Silver badge

      Re: maths - with an S

      "Moreover either matrix multiplication has been improved or it hasn't."

      It's not quite as straightforward as that, at least for multiplication of non-square matrices, where the scaling of computational complexity also depends on the shape of the matrices in question, so comparison becomes difficult. Furthermore, the practical efficiency of an algorithm, even for square matrix multiplication, depends on the hardware it runs on (e.g., CPU, GPU, FPGA, etc.) and how the algorithm is implemented on that hardware (e.g., to exploit parallelism, cache sizes, memory access, and so on).

      I had a skim through the paper, and it's actually pretty impressive (albeit, as they say, a work-in-progress). They achieve real reductions of multiplication operations on small matrices—which become building-blocks for large matrix multiplications—against state-of-the-art known algorithms. Perhaps more importantly, they are also able to optimise algorithms for specific hardware.
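      A handy way to see the "building block" point: if a scheme multiplies n x n matrices using R scalar multiplications, applying it recursively to blocks gives an O(n^log_n(R)) algorithm. A quick Python sketch (as I read the paper, its headline count is 47 multiplications for 4x4 in mod-2 arithmetic, against 49 from recursing Strassen; the rest is textbook):

```python
import math

def block_exponent(n, multiplications):
    """Recursing on an n x n scheme that uses R scalar multiplications
    costs O(n^log_n(R)) overall."""
    return math.log(multiplications, n)

# Naive 4x4 multiplication: 64 scalar multiplies -> exponent 3.
# Strassen applied at two levels: 7^2 = 49 -> exponent ~2.807.
# AlphaTensor's 4x4 scheme in mod-2 arithmetic: 47 -> ~2.777.
print(round(block_exponent(4, 64), 3),
      round(block_exponent(4, 49), 3),
      round(block_exponent(4, 47), 3))
```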

    2. Spoobistle

      Re: AlphaFold

      Even if AlphaFold were a silver bullet, expecting new drugs this soon would be a bit premature - there's a lot more work involved than just the protein structure. However, AlphaFold is becoming a productive tool for structural biologists - a structure prediction that comes with an indication of confidence helps tailor wet-lab experiments towards avenues likely to succeed. As with many things, ignore the hype and find out what it can realistically do.

  2. elDog

    I'm hoping that these superbots can tell us the meaning of '42'

    And do it within a current human's lifetime. I understand that 42 was the answer to a silly question. Now I want to know what 42 really means - beyond that silly question and lots of maths.

    Aren't we in the realm of designing an AI that can ask another AI a question that can't be answered - even if both are considered the "god" AI?

    (This is based on the question: Can god make a stone so heavy that god can't lift it?)

    ..... Knew I shouldn't have had a glass with my lunch ...

    1. The Oncoming Scorn Silver badge

      Re: I'm hoping that these superbots can tell us the meaning of '42'

      DEEP THOUGHT: Exactly. Now that you know that the answer to the Ultimate question of Life, the Universe, and Everything is forty-two, all you need to do now is find out what the Ultimate Question is.

      “There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarrely inexplicable.

      There is another theory which states that this has already happened.”

  3. John Smith 19 Gold badge

    Historically it was believed you needed n^3 operations to multiply 2 square matrices together

    In the early '70s Strassen took that down to something like n^2.8.

    As of 2022 it's n^2.3728596.

    BTW Doug Lenat was using heuristic AI in a program called "Eurisko" in the '80s to do mathematical discovery, i.e. to discover various laws of mathematics from basic axioms.

    So yes, better than what's taught in standard textbooks is possible. Apparently it costs a lot more additions, but those are relatively cheap compared to multiplications.
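    Strassen's trick, for reference - it trades one multiplication for a pile of extra additions in the 2x2 case, and recursion on blocks does the rest. A minimal sketch:

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 multiplications instead of 8
    (Strassen, 1969), at the cost of 18 additions/subtractions instead
    of 4. Applied recursively to blocks, this gives the ~n^2.81 bound."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]
```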

    Can the AI do better than this?

    Fooknose. I've seen so many BS AI claims over the years I doubt it's anywhere near as good as they are claiming.

    1. Tessier-Ashpool

      Re: Historically it was believed you needed n^3 operations to multiply 2 square matrices together

      Thanks for the link. Interesting article. I’d not thought of treating matrix elements as matrices themselves but of course it makes sense if you think about it. I wonder how this relates to the way that the universe itself works out what’s going on. Quantum mechanics is stuffed full of matrix operations and every part of the universe is said to be intimately connected with every other part.

      1. John Smith 19 Gold badge

        Thanks for the link.

        Happy to do so.

        We will always know more collectively than any one person can know alone, so it makes sense to share the knowledge.

    2. PRR Bronze badge

      Re: Historically it was believed you needed n^3 operations to multiply 2 square matrices together

      > early 70's ....something like n^2.8. As of 2022 its 2.3728596.

      So of the order of 236/631, or 1/2.67, in 40 years.

      Moore's Law (double size/speed/cost every 1.5 years) suggests 1/26,700,000 the time to do the deed. Ten million times more speed-up from CPU commercial advance compared to slicker techniques.

      So "simple" (not!) smaller features and larger device-counts seems to beat "smarts" all hollow.

      1. John Smith 19 Gold badge

        "So "simple" (not!) smaller features and larger device-counts seems to beat "smarts" all hollow."

        You don't appear to have heard of "Big O" notation. :-(

        If the size of the problem is big enough it will still bring the SoA hardware to its knees. That's where this comes in. The problem size you can tackle with a better algorithm goes up regardless of how fast the cores are (or how many of them there are) to begin with. It's like the Fast Fourier Transform's effect on signal processing software, turning overnight runs into minutes of computer time.

        CFD problems (for example) use lots of matrix math and continue to bring SoA HPC arrays to their knees.
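        To put rough numbers on that - a sketch of how the exponent gap compounds with problem size (constant factors, which are admittedly enormous for the 2.37-exponent algorithms, are ignored here):

```python
def algorithmic_speedup(n, old_exp=3.0, new_exp=2.3728596):
    """Ratio of rough operation counts n^old_exp / n^new_exp.
    Unlike a one-off hardware speedup, this grows without bound
    as the problem size grows (constant factors ignored)."""
    return n ** (old_exp - new_exp)

for n in (10**3, 10**6, 10**9):
    print(f"n={n:>13,}: ~{algorithmic_speedup(n):,.0f}x fewer operations")
```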

        1. PRR Bronze badge

          The title is too long.

          > If the size of the problem is big enough it will still bring the SoA hardware to its knees.

          Obviously. (*)

          So which will work sooner? Thinking to streamline the algorithm? Or just building a bigger hammer?

          The history of the FFT is interesting. It burbled along for something like 150 years. Then, all in a few years, it got "fast". While the technique is still studied, it has not got a heap faster since the 1965 era. Meanwhile our 'affordable' hardware has gotten a billion times faster.

          There is a finite set of "greatly improvable" problems. FFT was one. Sorting, QuickSort, is another less dramatic "overnight" improvement.

          We don't even know what is the lower bound on the complexity of fast Fourier transform algorithms.

          Finding the short-path for a specific problem may suggest improvements in a few related problems, but not for "all" problems. OTOH faster/wider computer guts speeds ALL problems.

          I've been putzing with computation for 60 years. I've lived through the "breakthroughs" and watched the 'inevitable' constant speed-up of computers. And the many headlines "The End of Moore's Law!!" Back in the 1960s the pundits pointed out that the speed of light limited performance to roughly a nanosecond, because how could you build an ALU/CPU smaller than about a foot across?

          (*) BTW, the size of your title, plus the customary "Re: ", brought the SotA Register hardware to its knees.

    3. LionelB Silver badge

      Re: Historically it was believed you needed n^3 operations to multiply 2 square matrices together

      Why not actually look at the paper they've published? It tells you exactly how good it is.

    4. rerdavies

      Re: Historically it was believed you needed n^3 operations to multiply 2 square matrices together

      To be honest, the improvements on Strassen are a curiosity without real practical application, since multiplies are no longer more expensive than additions on GPUs and TPUs, thanks to deep pipelining.

      The more practical and interesting result is the ability to optimise matrix multiplies on actual state-of-the-art SIMD hardware, where additions cost the same as multiplies and the bigger challenge is cache-line read efficiency.

      However, the comparisons against Strassen speak to the startling generality of the technique: it not only provides practical optimisation on real hardware, but also makes three concrete, documented improvements on the best results of decades of academic work on the problem.
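      For anyone wondering what the cache-efficiency point looks like in practice, here's an illustrative tiled multiply (pure Python only shows the access pattern; the actual payoff needs a C/SIMD kernel):

```python
def blocked_matmul(A, B, bs=2):
    """Tiled matrix multiply: walk bs x bs blocks so each tile of A and
    B is reused while it's hot in cache. In a real kernel, cache-line
    reads rather than arithmetic dominate the cost, which is what the
    blocking is for."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i0 in range(0, n, bs):
        for j0 in range(0, m, bs):
            for p0 in range(0, k, bs):
                # multiply-accumulate one block of C from blocks of A and B
                for i in range(i0, min(i0 + bs, n)):
                    for j in range(j0, min(j0 + bs, m)):
                        for p in range(p0, min(p0 + bs, k)):
                            C[i][j] += A[i][p] * B[p][j]
    return C
```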

  4. Anonymous Coward

    Google/DeepMind........and 1.6 Million Personal Medical Records.....

    .....slurped from the Royal Free Hospital.

    GDPR.....I hear you say? Well....GDPR required the consent of those 1.6 million citizens for the slurp to take place.

    How much consent did Google/DeepMind actually ask for? How about none! Not one citizen consented to the slurp.

    So this piece in El Reg might be good publicity.....which (obviously) Google/DeepMind really need.....since they have lots of history stealing data about citizens (see above).

    But this El Reg piece is still lipstick on a pig!!!!

  5. Justthefacts Silver badge

    Not matrix multiplication of normal numbers

    This has been mis-sold. It's not matrix multiplication of ordinary numbers. It's matrix multiplication over Galois fields of characteristic 2. See details here:

    Short version of GF(2): it's essentially one-bit arithmetic, in which case I'm not sure why focusing on multiplications (aka AND gates) rather than additions (aka XOR gates) is very interesting.

    Longer version: it's probably extensible to the extension fields GF(2^n), which are used in specialist applications like AES crypto and Reed-Solomon coding. AES does have a 4x4 matrix multiplication in it, although I'm dubious whether it really makes a significant difference to any implementation.
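    For the curious, GF(2) matrix multiplication in code - multiply is AND, add is XOR (equivalently, sum mod 2):

```python
def gf2_matmul(A, B):
    """Matrix multiplication over GF(2): each entry is the parity
    (sum mod 2) of the pairwise ANDs, i.e. XOR takes the place of
    addition and AND the place of multiplication."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][p] & B[p][j] for p in range(inner)) % 2
             for j in range(cols)]
            for i in range(rows)]
```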
