A neural network that memorizes data?
Now why does that sound familiar?
Private information can be easily extracted from neural networks trained on sensitive data, according to new research. A paper released on arXiv last week by a team of researchers from the University of California, Berkeley, National University of Singapore, and Google Brain reveals just how vulnerable deep learning is to …
This isn't too surprising, but I'm not sure 'remembering' is really the correct term.
I recall an earlier story here on El Reg where someone took an AI trained to identify cats in photographs and got it to generate a picture from the features it had learned and used to make its identifications. The picture was undoubtedly 'cat' but wasn't a picture of 'a' cat: iirc, there was no background, just edge-to-edge 'aspects' of 'cat', seamlessly blended together.
Now if that can be done with imagery as complicated as a cat then it seems likely that generating valid strings of numbers from an AI trained to identify valid strings of numbers shouldn't be beyond the bounds of possibility.
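That trick has a name, feature visualisation (or activation maximisation): start from noise and nudge the input until the network's 'cat' score is maximised. A minimal sketch, assuming PyTorch, a recent torchvision, and an ImageNet-trained classifier; the class index is an assumption:

```python
# Minimal activation-maximisation sketch: run gradient ascent on the input
# image to maximise a classifier's "cat" logit, reading learned features
# back out of the trained network.
import torch
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

CAT_CLASS = 281                      # ImageNet "tabby cat" (assumed index)
img = torch.randn(1, 3, 224, 224, requires_grad=True)   # pure-noise start
opt = torch.optim.Adam([img], lr=0.05)

for step in range(200):
    opt.zero_grad()
    logit = model(img)[0, CAT_CLASS]
    # Gradient *ascent* on the class logit, with a small L2 penalty so the
    # pixel values stay bounded
    loss = -logit + 1e-4 * img.pow(2).sum()
    loss.backward()
    opt.step()
# `img` now shows edge-to-edge 'aspects of cat' rather than any one cat
```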
I was thinking "But... but... but..." until the final paragraph: "never feed secrets as training data".
Quite, it's live data, not training data. The clue is already in the terminology.
In fact, did my sensitive data sign up to be used for training purposes? I think not.
Humans in particularly sensitive activities have long operated on a need-to-know basis.
Occasionally we hear of a dog or other animal being killed because it knows too much and would reveal something to an enemy.
If the "I" in AI is to mean anything, we're into the same situation.
That includes data that is sensitive only because the glare of publicity would reveal monumental waste of taxpayer funds, and such things.
They are really just a specialist type of database, so it's no surprise that what someone puts in can somehow be "read" out by someone else.
We've had SQL for ages, and look how often new plugins for web applications still allow data extraction.
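For flavour, here is that failure mode in miniature, using Python's built-in sqlite3:

```python
# The classic leak: building SQL by string concatenation lets crafted input
# read rows it shouldn't, while a parameterised query treats the same input
# as a plain value.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

user_input = "nobody' OR '1'='1"

# Vulnerable: the input rewrites the query and dumps every secret
rows = db.execute(
    f"SELECT secret FROM users WHERE name = '{user_input}'").fetchall()
print(rows)   # [('hunter2',)]

# Safe: the driver binds the input as a value, not as SQL
rows = db.execute(
    "SELECT secret FROM users WHERE name = ?", (user_input,)).fetchall()
print(rows)   # []
```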
A.I. boffins, shocked and surprised by latest observations, effectively announce that they actually don't really understand what's going on inside neural networks.
The phrase "haven't got the first clue yet" springs to mind.
Is there an explanation for their apparent dough-headedness?
"don't really understand what's going on inside neural networks"
I really wonder about this. Back in my days working on knowledge-capture systems, one of the requirements was an 'explain' function: which rules were fired, and which instances of training data supported each rule. Even better, we could derive a veracity score for each data-set source and up/down-rank them based on their contribution to eventual successful solutions. Humans, on the other hand, tend to generalize their training (which can be good) but then forget exactly why they came to certain conclusions. Computers don't forget anything.
Admittedly, we had a small data set compared to what Google has to deal with, so we could keep a full copy on our own server.
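A minimal sketch of the sort of 'explain' machinery described above, with all names hypothetical: conclusions carry the rules that fired and the instances backing them, and each source's veracity score can be up- or down-ranked after the fact:

```python
# Sketch of an explainable rule engine: every conclusion carries the rules
# that fired and the training instances backing each rule, and each data
# source keeps a veracity score nudged up or down as solutions succeed or
# fail. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Rule:
    name: str
    condition: object               # callable: fact dict -> bool
    conclusion: str
    support: list = field(default_factory=list)   # training-instance ids

veracity = {"source_a": 0.5, "source_b": 0.5}     # per-source scores

def infer(facts, rules):
    fired = [r for r in rules if r.condition(facts)]
    return {
        "conclusions": [r.conclusion for r in fired],
        # The 'explain' part: which rules fired, backed by which instances
        "explanation": [(r.name, r.support) for r in fired],
    }

def feedback(sources, success, lr=0.1):
    # Up/down-rank the sources behind a solution once it is judged
    for s in sources:
        veracity[s] += lr if success else -lr

rules = [Rule("hot_liquid", lambda f: f.get("temp_c", 0) >= 100,
              "boiling", support=["case_17", "case_23"])]
print(infer({"temp_c": 120}, rules))
feedback(["source_a"], success=True)
print(veracity)   # source_a up-ranked after a successful solution
```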
The book 'Blink' by Malcolm Gladwell outlines the development and operation of this largely subconscious process.
There is no such thing as non-confidential personal data. All data is confidential, because it can be combined with other data to fill in whatever is missing. Much of the 'anonymised' data sold by Google et al simply has names removed, but if you correlate it with another source you can easily de-anonymise it. For example, location data can be cross-referenced with publicly available names and addresses, then simply searched for where you go at the end of each day. Similar techniques can be used to extract full credit card numbers from just the last four digits. AI is already doing this in an unstructured way, so it's inevitable that its memory will contain confidential information even if it isn't fed any. The larger the database, the sooner this happens.
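The end-of-day trick is simple enough to sketch. Everything below is hypothetical data, but the join is the whole technique:

```python
# Sketch of re-identifying "anonymised" location data: the place a device
# sits late at night is usually a home address, and home addresses map to
# names via public records. All data below is hypothetical.
from collections import Counter

# (device_id, hour_of_day, rounded_lat, rounded_lon) pings, names stripped
pings = [
    ("dev42", 23, 51.501, -0.142),
    ("dev42", 23, 51.501, -0.142),
    ("dev42", 13, 51.514, -0.098),   # daytime ping: an office, say
    ("dev42",  0, 51.501, -0.142),
]

# A public (or purchasable) address-to-resident table
residents = {(51.501, -0.142): "A. Example, 1 Acacia Avenue"}

def likely_home(device_id, pings):
    night = [(lat, lon) for d, h, lat, lon in pings
             if d == device_id and (h >= 22 or h <= 5)]
    return Counter(night).most_common(1)[0][0]

print(residents.get(likely_home("dev42", pings)))
# -> "A. Example, 1 Acacia Avenue": the 'anonymous' device has a name again
```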
Qualcomm knows that if it wants developers to build and optimize AI applications across its portfolio of silicon, the Snapdragon giant needs to make the experience simpler and, ideally, better than what its rivals have been cooking up in the software stack department.
That's why on Wednesday the fabless chip designer introduced what it's calling the Qualcomm AI Stack, which aims to, among other things, let developers take AI models they've developed for one device type, let's say smartphones, and easily adapt them for another, like PCs. This stack is only for devices powered by Qualcomm's system-on-chips, be they in laptops, cellphones, car entertainment, or something else.
While Qualcomm is best known for its mobile Arm-based Snapdragon chips that power many Android phones, the chip house is hoping to grow into other markets, such as personal computers, the Internet of Things, and automotive. This expansion means Qualcomm is competing with the likes of Apple, Intel, Nvidia, AMD, and others, on a much larger battlefield.
Comment More than 250 mass shootings have occurred in the US so far this year, and AI advocates think they have the solution. Not gun control, but better tech, unsurprisingly.
Machine-learning biz Kogniz announced on Tuesday it was adding a ready-to-deploy gun detection model to its computer-vision platform. The system, we're told, can detect guns seen by security cameras, send notifications to those at risk, notify police, lock down buildings, and perform other security tasks.
In addition to spotting firearms, Kogniz uses its other computer-vision modules to notice unusual behavior, such as children sprinting down hallways or someone climbing in through a window, which could indicate an active shooter.
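The detect-then-respond pipeline described is a common pattern. The sketch below uses hypothetical stand-ins for the detector and the alert hooks; these are not Kogniz's actual APIs:

```python
# Generic detect-and-respond loop of the kind described: run each camera
# frame through a detector and, on a confident gun detection, fan out the
# alerts. Detector and alert/lockdown hooks are hypothetical stand-ins.
THRESHOLD = 0.9

def handle_frame(frame, detector, alert, lockdown):
    for det in detector(frame):        # -> [{"label": ..., "score": ...}]
        if det["label"] == "gun" and det["score"] >= THRESHOLD:
            alert(f"gun detected (score {det['score']:.2f})")
            lockdown()
            return True
    return False

# Toy stand-ins to show the flow end to end
fake_detector = lambda frame: [{"label": "gun", "score": 0.97}]
handle_frame(None, fake_detector,
             alert=print, lockdown=lambda: print("lockdown triggered"))
```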
Microsoft has pledged to clamp down on access to AI tools designed to predict emotions, gender, and age from images, and will restrict the usage of its facial recognition and generative audio models in Azure.
The Windows giant made the promise on Tuesday while also sharing its so-called Responsible AI Standard, a document [PDF] in which the US corporation vowed to minimize any harm inflicted by its machine-learning software. This pledge included assurances that the biz will assess the impact of its technologies, document models' data and capabilities, and enforce stricter use guidelines.
This is needed because – and let's just check the notes here – there are apparently not enough laws yet regulating machine-learning technology use. Thus, in the absence of this legislation, Microsoft will just have to force itself to do the right thing.
Interview In June, Purism began shipping a privacy-focused smartphone called Librem 5 USA that runs on a version of Linux called PureOS rather than Android or iOS. As the name suggests, it's made in America – all the electronics are assembled in its Carlsbad, California facility, using as many US-fabricated parts as possible.
While past privacy-focused phones, such as Silent Circle's Android-based Blackphone, failed to win much market share, the political situation is different now than it was seven years ago.
Supply-chain provenance has become more important in recent years, thanks to concerns about the national security implications of foreign-made tech gear. The Librem 5 USA comes at a cost, starting at $1,999, though there are now US government agencies willing to pay that price for homegrown hardware they can trust – and evidently tech enthusiasts, too.
Analysis After re-establishing itself in the datacenter over the past few years, AMD is now hoping to become a big player in the AI compute space with an expanded portfolio of chips that cover everything from the edge to the cloud.
As executives laid out during AMD's Financial Analyst Day 2022 event last week, the resurgent chip designer believes it has the right silicon and software coming into place to pursue the wider AI space.
In brief US hardware startup Cerebras claims to have trained the largest AI model ever run on a single device, powered by its Wafer Scale Engine 2, the world's largest chip, the size of a plate.
"Using the Cerebras Software Platform (CSoft), our customers can easily train state-of-the-art GPT language models (such as GPT-3 and GPT-J) with up to 20 billion parameters on a single CS-2 system," the company claimed this week. "Running on a single CS-2, these models take minutes to set up and users can quickly move between models with just a few keystrokes."
The CS-2 packs a whopping 850,000 cores, and has 40GB of on-chip memory capable of reaching 20 PB/sec memory bandwidth. The specs of other types of AI accelerators and GPUs pale in comparison, meaning machine learning engineers typically have to train huge AI models with billions of parameters across many servers.
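For scale: 20 billion fp16 parameters alone occupy 40GB, which lines up with the stated on-chip memory. A quick back-of-the-envelope check (the byte-per-parameter sizes are standard; the rest is the article's own figures):

```python
# Back-of-the-envelope: memory for the *weights alone* of a 20B-parameter
# model at common precisions (optimizer state and activations come on top).
params = 20e9
for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: {params * bytes_per_param / 1e9:,.0f} GB")
# fp32: 80 GB, fp16/bf16: 40 GB, int8: 20 GB
# At fp16 the weights alone fill the CS-2's stated 40GB, which is why larger
# models (or full training state) usually end up sharded across many servers.
```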
Period- and fertility-tracking apps have become potential weapons in post-Roe America, following Friday's Supreme Court decision.
These seemingly innocuous trackers contain tons of data about sexual history, menstruation and pregnancy dates, all of which could now be used to prosecute women seeking abortions — or incite digital witch hunts in states that offer abortion bounties.
Under a law passed last year in Texas, any citizen who successfully sues an abortion provider, a health center worker, or anyone who helps someone access an abortion after six weeks can claim at least $10,000, and other US states are following that example.
American lawmakers held a hearing on Tuesday to discuss a proposed federal information privacy bill that many want yet few believe will be approved in its current form.
The hearing, dubbed "Protecting America's Consumers: Bipartisan Legislation to Strengthen Data Privacy and Security," was overseen by the House Subcommittee on Consumer Protection and Commerce of the Committee on Energy and Commerce.
Therein, legislators and various concerned parties opined on the American Data Privacy and Protection Act (ADPPA) [PDF], proposed by Senator Roger Wicker (R-MS) and Representatives Frank Pallone (D-NJ) and Cathy McMorris Rodgers (R-WA).
In Brief No, AI chatbots are not sentient.
As soon as the story of a Google engineer who blew the whistle on what he claimed was a sentient language model went viral, multiple publications stepped in to say he was wrong.
The debate over whether the company's LaMDA chatbot is conscious, or has a soul, isn't a very good one, because it's too easy to shut down the side that believes it does. Like most large language models, LaMDA has billions of parameters and was trained on text scraped from the internet. The model learns the relationships between words and which ones are more likely to appear next to each other.
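That next-word idea can be shown in miniature. A toy bigram sketch; real models learn vastly richer statistics, but the principle is the same:

```python
# Toy bigram model: count which word follows which, then read off the
# probability of each continuation. LaMDA-scale models learn far richer
# statistics over billions of parameters, but the objective -- predict
# what comes next -- is the same in spirit.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probs(word):
    counts = follows[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))   # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```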
California lawmakers met in Sacramento today to discuss, among other things, proposed legislation to protect children online. The bill, AB2273, known as The California Age-Appropriate Design Code Act, would require websites to verify the ages of visitors.
Critics of the legislation contend this requirement threatens the privacy of adults and the ability to use the internet anonymously, in California and likely elsewhere, because of the role the Golden State's tech companies play on the internet.
"First, the bill pretextually claims to protect children, but it will change the Internet for everyone," said Eric Goldman, Santa Clara University School of Law professor, in a blog post. "In order to determine who is a child, websites and apps will have to authenticate the age of ALL consumers before they can use the service. No one wants this."