Computer-generated training images
Slight problem with self-driving cars that all head toward the spinning Nvidia logo on billboards
The saying "data is the new oil" was reportedly coined by British mathematician and marketing whiz Clive Humby in 2006. Humby's remark rings truer now than ever with the rise of deep learning. Data is the fuel powering modern AI models; without enough of it, the performance of these systems will sputter and fail. And like …
But they don't like "fake it until you make it" since Elizabeth Holmes & Theranos. Especially when you can potentially kill people or give them the wrong medical information because of it.
Still, they never learn, and Silicon Valley continues to allow this and invest in bollocks.
As we've seen with some AI in the lab, just because it works as expected there doesn't mean it will behave the same when live. Take the maze agents trained on keys and chests. I'm no expert, I'm going off the Robert Miles videos. In training, picking up keys and using them to open chests was good. But when the AI was put out into the wild, there were now more keys than chests, and its behaviour changed from the test lab: it ended up just picking up keys. It had decided it liked keys; chests were OK, but it loved keys. With one chest left it could see its own key inventory and got stuck trying to pick up the keys it was already holding.
Robert explains it better than I ever could.
https://youtu.be/zkbPdEHEyEI
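To make the distribution-shift point concrete, here's a back-of-the-envelope Python sketch. It's my own toy, not the actual experiment from the video; the `key_lover` rule and the key/chest counts are invented. It just shows how the same learned rule can look perfectly fine in training and pathological once the key/chest ratio flips:

```python
# Back-of-the-envelope sketch of the keys-and-chests story above.
# Not the real experiment -- the rule and the numbers are made up.
# Reward only comes from opening chests; each chest costs one key.

def key_lover(n_keys, n_chests):
    """The learned rule: grab every key you see, open chests while you can."""
    opened = min(n_keys, n_chests)   # chests opened (one key spent per chest)
    hoarded = n_keys - opened        # keys grabbed but never spent
    return opened, hoarded

# Training-like levels: keys are scarce, so grabbing every key is optimal.
print(key_lover(n_keys=2, n_chests=10))   # (2, 0) -- looks perfectly aligned

# Deployment-like levels: keys outnumber chests; the same rule hoards them.
print(key_lover(n_keys=10, n_chests=2))   # (2, 8) -- "it loved keys"
```

The rule never changed between the two runs; only the environment did, which is the whole point.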
"we can generate whatever distribution of ethnicities, ages, genders you want in your data, so we are not biased in any way"
The moment you specify a distribution up front, you implement a bias (whether or not you're smart enough to recognise that), because your specification is based on your prior expectation.
The reason for random sampling from a real population is that it doesn't build in any prior expectation. Statistics 101.
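A toy illustration of the difference, with an invented population (the group labels and proportions are made up purely for the sake of the example):

```python
import random

# Invented population: the true mix is unknown to us up front.
population = ["A"] * 700 + ["B"] * 250 + ["C"] * 50

# "We can generate whatever distribution we want": a 1/3-1/3-1/3 spec
# is itself a prior expectation, baked straight into the dataset.
specified = ["A"] * 333 + ["B"] * 333 + ["C"] * 334

# Random sampling instead lets the population speak for itself.
random.seed(0)
sampled = random.choices(population, k=1000)

def mix(xs):
    return {g: round(xs.count(g) / len(xs), 2) for g in "ABC"}

print("specified:", mix(specified))  # ~{'A': 0.33, 'B': 0.33, 'C': 0.33}
print("sampled:  ", mix(sampled))    # close to {'A': 0.70, 'B': 0.25, 'C': 0.05}
```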
> The reason for random sampling from a real population is that it doesn't build in any prior expectation.
Quite true - but even then you can still have bias, because your "random" sampling protocol is biased (in ways you may be unaware of), or simply because the population you're sampling from has a highly complex, multi-modal distribution and your sample size is too small.
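For instance, here's a quick made-up example of the multi-modal case: with a rare mode that's only about 1% of the population, a small sample will usually miss it entirely:

```python
import random

random.seed(1)

# Invented multi-modal population: a big common mode plus a rare mode
# that makes up about 1% of the whole.
population = ([random.gauss(0, 1) for _ in range(9900)]
              + [random.gauss(8, 0.5) for _ in range(100)])

def rare_share(sample):
    return sum(x > 5 for x in sample) / len(sample)

small = random.choices(population, k=30)
large = random.choices(population, k=10000)

print(rare_share(small))   # usually 0.0 -- the rare mode is missed entirely
print(rare_share(large))   # roughly 0.01 -- close to the true 1%
```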