Google claims Big Sleep 'first' AI to spot freshly committed security bug that fuzzing missed

Google claims one of its AI models is the first of its kind to spot a memory safety vulnerability in the wild – specifically an exploitable stack buffer underflow in SQLite – which was then fixed before the buggy code's official release. The Chocolate Factory's LLM-based bug-hunting tool, dubbed Big Sleep, is a collaboration …

  1. Richard 12 Silver badge

    How many false positives though?

    It rather makes me wonder if it simply listed every assert()

    1. Anonymous Coward
      Anonymous Coward

      Infinite Number of Monkeys

      Syndrome.

      1. Bebu sa Ware
        Windows

        Re: Infinite Number of Monkeys

        Just read recently that apparently not: monkeys-do-not-have-the-time.

        Although you could clearly, in finite time, train a single (surviving) monkey to transcribe Shakespeare's Hamlet on a typewriter. Think brain implants, lots of pain, and an adequate supply of monkeys.

        1. O'Reg Inalsin

          Re: Infinite Number of Monkeys

          Forget training, just let them evolve.

        2. Anonymous Coward
          Anonymous Coward

          Re: Infinite Number of Monkeys

          Note that the "Infinite Number of Monkeys" specifies multiple conditions:

          1. Infinite number of monkeys

          2. Infinite number of typewriters

          3. Infinite amount of time

          The linked article starts by pointing out that #3 doesn't hold. (Which is obvious, along with the lack of an infinite number of monkeys and typewriters.)

          The point of the thought experiment, however, is absolutely true - given enough random data, sooner or later some piece of it is going to look "interesting". See, for example, Nostradamus' "prophecies".

  2. Anonymous Coward
    Anonymous Coward

    Fuzzing?

    I'm struggling to think of how any adequate fuzzing wouldn't have caught this. If the fuzzing was done and didn't catch this one, then I'd worry about all the other similar problems that neither the fuzzing nor the AI caught.

    1. Clausewitz4.0 Bronze badge
      Devil

      Re: Fuzzing?

      "how any adequate fuzzing wouldn't have caught this"

      They need a few more billions in VC money, so they put some fuzzers and researchers on finding vulns and tell the world it was snake-oil AI that found them. Sell the silver bullet and the stock for a few more billions before the bubble bursts. Like IronNet or Darktrace. And hire a former NSA officer for the board.

  3. Bebu sa Ware
    Windows

    Removing assert() in production

    I recall many, many years ago reading in the work of one of Dijkstra, Brinch Hansen or Wirth that removing bounds checks or correctness assertions (e.g. invariant checks) from production code was akin to training with parachutes but flying into combat without them.

    The only logical reason for removing assert()s from production code is where it is proven that b-expr in assert(b-expr) evaluates to the constant true.

    A decent compiler/linker for a language less shabby than most could probably optimise away most assertions.

    Using magic values outside the domain (of an array, of a function...) can frequently end in tears - nul-terminated strings are probably responsible for an ocean or two, and (-1) for 0-based arrays a couple of lakes (array indices are often unsigned, and if they are 32-bit on a 64-bit system, (-1) is interpreted as a large positive offset (2^32 - 1) which, while out in the boonies, could still be a valid location in the process' memory).
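
    A tiny illustration of that (-1) hazard (a made-up example, not the SQLite code):

        #include <inttypes.h>
        #include <stdio.h>

        int main(void)
        {
            int sentinel = -1;                    /* "magic" not-found value */
            uint32_t index = (uint32_t)sentinel;  /* wraps to 4294967295, i.e. 2^32 - 1 */

            /* Indexing an array with this now points ~4 GiB past its start; on a
             * 64-bit process that offset may still be mapped, so it can silently
             * hit unrelated memory rather than crash cleanly. */
            printf("sentinel %d becomes index %" PRIu32 "\n", sentinel, index);
            return 0;
        }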

    1. Richard 12 Silver badge

      Re: Removing assert() in production

      Performance matters.

      The compiler generally cannot prove that the function can only be called from functions that have already verified the invariants, so you end up with a long series of identical checks in a single callstack.

      E.g. checking bounds on every list.at(x) call in a loop that is already doing x < list.size() every iteration is pointless.

      You do, however, want to be sure that you have the checking somewhere in the call chain.

      So you use a proper if (bad) report_failure() at the top of the callchains, with duplicate assert() further down in debug and automated testing, and remove the duplicate asserts() in production once you've sufficiently proven they are never hit.
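
      A small C sketch of that layout, with hypothetical names:

          #include <assert.h>
          #include <stddef.h>
          #include <stdio.h>

          /* Internal helper: callers must pass a valid index, so debug builds
           * only assert the invariant (compiled out with -DNDEBUG). */
          static int get_item(const int *items, size_t count, size_t i)
          {
              assert(i < count);
              return items[i];
          }

          /* Public entry point: the "if (bad) report_failure()" check, done once. */
          int sum_items(const int *items, size_t count)
          {
              if (items == NULL) {
                  fprintf(stderr, "sum_items: NULL input\n");
                  return -1;
              }
              int total = 0;
              for (size_t i = 0; i < count; i++)
                  total += get_item(items, count, i);   /* no per-call bounds check */
              return total;
          }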

      1. m4r35n357 Silver badge

        Re: Removing assert() in production

        So does security and correctness.

        You can risk speculative execution for performance if you are sure there will never be a viable attack.

        You can say that you understand all your possible failure modes, and risk removing the asserts yourself at runtime.

        You can risk putting in the asserts, in the hope that someone doesn't come along later and mess with one of your makefiles.

        1. Richard 12 Silver badge

          Re: Removing assert() in production

          Checking the right thing once is secure and correct.

          Checking the same thing 1,000,000,000,000,000,000 times is a waste of time, and can even make the application effectively unusable.

          That's why the release and debug versions of the MSVC STL are incompatible. The debug version does a lot of additional bookkeeping checks.

          As for "messing with makefiles" - there's a reason for "death" and "death in debug" tests. If those start failing you know something very silly has happened.

  4. m4r35n357 Silver badge

    Amateur c coder here (testing el Reg's code tag)

    Roll your own asserts, so compiler flags can't fuck things up. Colour codes optional.

    /*
     * Colours
     */
    #define GRY "\x1B[1;30m"
    #define RED "\x1B[1;31m"
    #define WHT "\x1B[1;37m"
    #define NRM "\x1B[0m"

    /*
     * Unavoidable "assert", in colour
     */
    #define CHECK(x) do { \
        if(!(x)) { \
            fprintf(stderr, "%sFAIL %s%s %s%s%s %s%s:%s%i\n", RED, WHT, #x, GRY, __func__, NRM, __FILE__, GRY, NRM, __LINE__); \
            exit(1); \
        } \
    } while (0)

    Comments welcome, for starters, my lovely whitespace is all ruined by posting ;)
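
    A quick usage sketch, assuming the colour defines and CHECK macro above are pasted into the same file along with the headers mentioned in the reply below:

        #include <stdio.h>
        #include <stdlib.h>

        /* ... GRY/RED/WHT/NRM and CHECK(x) from the comment above ... */

        int main(void)
        {
            int divisor = 0;
            CHECK(divisor != 0);             /* prints "FAIL divisor != 0 main <file>:<line>" in colour, then exit(1) */
            printf("%d\n", 10 / divisor);    /* never reached when the check fails */
            return 0;
        }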

    1. m4r35n357 Silver badge

      Re: Amateur c coder here (testing el Reg's code tag)

      Just to mention, the code above is standard C99. Don't forget the:

      #include <stdio.h>
      #include <stdlib.h>

      of course . . . .

      1. Richard 12 Silver badge

        Re: Amateur c coder here (testing el Reg's code tag)

        exit(1) means it's completely impossible to debug.

        In most real applications STDERR goes to /dev/null. Even if it doesn't, all you'll get is a function name - not how it got there.

        So all you know is that it failed!

        At the very least, get a stacktrace if not a full core!

        1. m4r35n357 Silver badge

          Re: Amateur c coder here (testing el Reg's code tag)

          Hi Richard, I hope you noticed the bit where I said "amateur"!

          1. I like to fail fast, not debug. I know where the data comes from in my program, because I don't over-complicate things.

          2. stderr only goes to /dev/null if you tell it to! File name, function name & line is pretty useful information.

          3. It is designed to work just like assert(), not produce some whizzy object-oriented stack trace (I don't know how to do that in C anyway!).

          1. Richard 12 Silver badge

            Re: Amateur c coder here (testing el Reg's code tag)

            Then this is a learning opportunity!

            Always make programs debuggable. It is more important than being correct, because everyone makes mistakes.

            The minimal 'nix way is to signal abject failure, usually SIGABRT (e.g. by calling abort()), so an attached debugger can trap the signal and gdb (or whatever) can help you find what went wrong.

            If you take a look, you'll probably find that's what assert() does (when enabled)
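
            A rough sketch of what that might look like applied to the CHECK macro from earlier in the thread, with exit(1) swapped for abort():

                #include <stdio.h>
                #include <stdlib.h>

                /* Same shape as CHECK above, but abort() raises SIGABRT so an
                 * attached debugger stops right at the failure, and (with core
                 * dumps enabled, e.g. ulimit -c unlimited) you keep a core file. */
                #define CHECK(x) do { \
                    if(!(x)) { \
                        fprintf(stderr, "FAIL %s %s %s:%i\n", #x, __func__, __FILE__, __LINE__); \
                        abort(); \
                    } \
                } while (0)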

            1. m4r35n357 Silver badge

              Re: Amateur c coder here (testing el Reg's code tag)

              Haha maybe 40 years ago ;)

              FWIW my code is tiny, and correct - and I do not tolerate bugs or compiler warnings ;)

              I have followed your suggestion however, just in case!

  5. tekHedd

    This bug happened because misusing asserts is the norm

    You're supposed to assert things that can't happen. This has two consequences:

    * If one fails, it means your assumptions are false, and this is something you should not be able to ignore

    * Removing them should have no side effects, because they are testing impossibilities

    If one of these does not apply to your assertion, you are doing it wrong.

    assert(false) should MESS #%(*&k UP! Assert failure is supposed to point out an incorrect assumption immediately, so you will catch it as early as possible. It should do something you can't ignore in your test suite.

    You should not be testing your asserts. They are impossibilities. If your expected input can trigger an assert(), it should not be an assert.

    This isn't rocket science, it's 40 year old established coding basics.
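
    A small C illustration of that split, with made-up names:

        #include <assert.h>
        #include <stdio.h>
        #include <stdlib.h>

        /* External input may legitimately be wrong: handle it, don't assert it. */
        static int parse_port(const char *text)
        {
            int port = atoi(text);
            if (port < 1 || port > 65535) {
                fprintf(stderr, "invalid port: %s\n", text);
                return -1;                 /* expected failure path */
            }
            return port;
        }

        /* Internal invariant: "can't happen" if the caller did its job, so an
         * assert documents the assumption and removing it changes nothing. */
        static void open_port(int port)
        {
            assert(port >= 1 && port <= 65535);
            printf("opening port %d\n", port);
        }

        int main(int argc, char **argv)
        {
            int port = parse_port(argc > 1 ? argv[1] : "8080");
            if (port != -1)
                open_port(port);
            return 0;
        }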

  6. Sora2566 Silver badge

    I suppose even a broken clock is right twice a day...
