Fake it until you make it: Can synthetic data help train your AI model?

The saying "data is the new oil" was reportedly coined by British mathematician and marketing whiz Clive Humby in 2006. Humby's remark rings truer now than ever with the rise of deep learning. Data is the fuel powering modern AI models; without enough of it, the performance of these systems will sputter and fail. And like …

  1. Yet Another Anonymous coward Silver badge

    Computer generated training images

    Slight problem with self-driving cars that all head toward the spinning Nvidia logo on billboards

  2. Anonymous Coward

    Faux Data

    Synthetic data seems to have some advantages for some applications.


    Regardless of the data, the problem remains that the "Intelligence" itself is, and will remain, fake.

  3. sreynolds

    I've always had a problem with the new oil expression

    Oil helped to produce lots of cheap crap for the masses, whereas data is used to push more cheap crap on the masses.

  4. steviebuk Silver badge

    I know it's not what the article is about

    But they don't like "Fake it until you make it" since Elizabeth Holmes & Theranos. Especially when you can potentially kill people or give them the wrong medical information because of it.

    Still, they never learn, and Silicon Valley continues to allow this and invest in bollocks.

    As we've seen with some AI in the labs, just because it works as expected in the lab doesn't mean it will behave the same when live. Take the trained maze hunters, for example. I'm no expert, I'm going off the Robert Miles videos. The objective was picking up keys and using them to open chests, which is fine. But when the AI was put out into the wild, it ended up just picking up keys, because there were now more keys than chests; its behaviour had changed from the test lab. It decided it liked keys more: chests were OK, but it loved keys. It could see its own key inventory and, with one chest left, got stuck trying to pick up the keys in its own inventory.

    Robert explains it better than I ever could.

    1. Stumo

      Re: I know it's not what the article is about

      I was interested to see more on these "keys and chests" videos, but my googling hasn't been able to find them - are you able to provide a link?

  5. Mike 137 Silver badge

    Unbiased?

    "we can generate whatever distribution of ethnicities, ages, genders you want in your data, so we are not biased in any way"

    The moment you specify a distribution up front, you implement a bias (whether or not you're smart enough to recognise that), because your specification is based on your prior expectation.

    The reason for random sampling from a real population is that you can't have any prior expectation. Statistics 101.

    1. LionelB Silver badge

      Re: Unbiased?

      > The reason for random sampling from a real population is that you can't have any prior expectation.

      Quite true - but even then you can still have bias, because your "random" sampling protocol is biased (in ways you may be unaware of), or simply because the population you're sampling from has a highly-complex, multi-modal distribution and your sample size is too small.
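The small-sample point above can be illustrated with a quick simulation. This is only a sketch under made-up assumptions: a two-mode population in which a minority mode holds 5% of the mass, with hypothetical helper names `draw_sample` and `fraction_missing_minority`. None of these numbers come from the article or the comments. It counts how often a genuinely random sample contains no minority-mode point at all.

```python
import random


def draw_sample(n, rng, minority_share=0.05):
    """Draw n values from a two-mode population: a majority mode near 0
    and a minority mode near 10 (shares and locations are illustrative)."""
    return [
        rng.gauss(10.0, 1.0) if rng.random() < minority_share else rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]


def fraction_missing_minority(n, trials=2000, seed=0):
    """Fraction of random samples of size n with no minority-mode point."""
    rng = random.Random(seed)
    missing = 0
    for _ in range(trials):
        sample = draw_sample(n, rng)
        # The modes sit 10 standard deviations apart, so 5.0 separates them.
        if not any(x > 5.0 for x in sample):
            missing += 1
    return missing / trials


small = fraction_missing_minority(n=20)
large = fraction_missing_minority(n=500)
print(f"n=20:  {small:.1%} of samples miss the minority mode")
print(f"n=500: {large:.1%} of samples miss the minority mode")
```

Under these assumptions, the chance that a sample of 20 misses the minority mode entirely is 0.95^20, roughly 36%, even though the sampling itself is perfectly random; at n=500 the chance is negligible.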
