
How does it work?
It simply gives it a Brummie, Glaswegian or some other strong accent... guaranteed to defeat any AI
(listen to 'Breaking the News' from BBC Scotland... the only podcast to come with subtitles)
The thought that our gadgets are spying on us isn't a pleasant one, which is why a group of Columbia University researchers have created what they call "neural voice camouflage." This technology won't necessarily stop a human listener from understanding someone if they're snooping (you can give recordings a listen and view …
> any reasonable British accent will defeat these American speech "recognition" programs
It certainly defeats mine!... I still remember when I first arrived in the UK (30-40 years ago) and tried to order something in a pub. The landlord replied something to me in a language I'd never heard before... I don't know if he was just having fun with the foreigner, but it didn't sound anything like English as I know it.
I seem to recall that a simple white noise generator (check V for Vendetta, among others), aka the go-to spy covert-conversation protector, is largely enough to confuse a microphone while still letting the human ear make out what one's neighbour is saying.
Have physics changed, or has that always been a red herring?
White noise is a brute-force attack: to generate white noise that will mask you from a microphone but still allow conversation, it needs to be directed towards the microphone or generated at an intervening surface (e.g. a window pane). This is a much more subtle approach. It is also not too hard to remove white noise; it's what noise-cancelling headphones and the background noise suppression in videoconferencing applications do all the time.
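For a feel of why a steady noise floor is the easy case, here's a minimal spectral-subtraction sketch, the textbook way to strip stationary noise from a recording. Modern conferencing stacks use fancier ML and headphones cancel with anti-phase sound, but they all exploit the same predictability of the noise. The file names and the assumption that the first half-second is noise-only are placeholders.

```python
# Sketch: removing stationary white noise by spectral subtraction.
# Assumes a mono recording whose first 0.5 s is noise-only; file names are placeholders.
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft, istft

rate, audio = wavfile.read("noisy_speech.wav")
audio = audio.astype(np.float64)

f, t, spec = stft(audio, fs=rate, nperseg=1024)

# Estimate the noise floor from the presumed speech-free lead-in.
noise_frames = int(0.5 * rate / (1024 // 2))
noise_mag = np.abs(spec[:, :noise_frames]).mean(axis=1, keepdims=True)

# Subtract the noise magnitude, keep the original phase.
clean_mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
clean_spec = clean_mag * np.exp(1j * np.angle(spec))

_, cleaned = istft(clean_spec, fs=rate, nperseg=1024)
wavfile.write("denoised.wav", rate, cleaned.astype(np.int16))
```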
This is an academically interesting technique because it relies on generating more specific "anti-noise" than a brute-force white noise approach. How useful it is remains to be seen, but it suggests an interesting ability to (for example) allow a phone conversation that humans can understand but that cannot be automatically transcribed by anyone intercepting it. That's unlikely to be useful for secure conversations, because it's probably easier to just end-to-end encrypt the voice channel, but it may stop your phone provider using your speech to sell you a new upgrade (or your government sending you for re-education). I can foresee it being a feature offered by privacy-focussed communication apps.
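To make "anti-noise" concrete: at heart it is an adversarial perturbation, a barely audible tweak chosen to push a recognizer towards a wrong transcription. The sketch below shows only that core step against a placeholder differentiable ASR model (`asr_model` and `transcription_loss` are stand-ins, not real APIs); the Columbia system's own twist is predicting such perturbations ahead of time so they can be played over live speech, which this toy does not attempt.

```python
# Sketch of the core "anti-noise" idea: nudge the waveform in the direction that
# most increases a speech recognizer's loss, while keeping the change tiny.
# `asr_model` and `transcription_loss` are placeholders for any differentiable
# recognizer (e.g. a CTC-based PyTorch model); this is NOT the published
# neural voice camouflage system, just the underlying adversarial step.
import torch

def camouflage(waveform: torch.Tensor, transcript: torch.Tensor,
               asr_model, transcription_loss, epsilon: float = 0.002) -> torch.Tensor:
    waveform = waveform.clone().requires_grad_(True)
    loss = transcription_loss(asr_model(waveform), transcript)
    loss.backward()
    # One FGSM-style step: a barely audible perturbation that degrades recognition.
    perturbation = epsilon * waveform.grad.sign()
    return (waveform + perturbation).detach()
```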
I think current AI transcribers are pretty good at listening through white noise. Not as good as humans, obviously, but I suspect that in order to defeat an AI transcriber through white noise alone, you'd have to deploy a volume that would be annoying to humans.
Indeed, this is an AI model that can defeat AI models trying to listen to you. The escalation of this naturally results in an AI model eventually figuring out the only way it can win is by eliminating humans.
A long-gone solicitor relative of mine told a group of us kids that the way to defeat eavesdroppers is with anything that rustles. To demonstrate he had one of us sit opposite him at a small table, and the rest of us were at the other end of the room. He just casually rustled a newspaper, while speaking in an even quiet voice. We couldn't work out anything, except just the occasional random word.
The human brain is fucking amazing at picking data from noise.
There's an oft-cited experiment where someone reads a corrupted text to another person who corrects it in real time; and when I say real time, I mean <30 ms lag.
Alternatively, there's another experiment which uses TDM to reduce the data in a flow of speech to c. 10%, and it's still intelligible (roughly the effect sketched below).
Human speech has evolved over millions of years, and it has spared our bodies the need for power, speed or strength.
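I don't have a reference for the exact experiment to hand, but the gist (throw away most of the speech signal in the time domain and see whether it is still intelligible) is easy to play with. A rough sketch, where the 50 ms frame, the 10% duty cycle and the file names are all arbitrary placeholders:

```python
# Rough sketch of time-gating speech so only ~10% of each frame survives,
# in the spirit of the interrupted-speech experiment mentioned above.
# Frame length (50 ms) and 10% duty cycle are arbitrary assumptions.
import numpy as np
from scipy.io import wavfile

rate, audio = wavfile.read("speech.wav")            # placeholder file
frame = int(0.050 * rate)                           # 50 ms frames
keep = int(0.10 * frame)                            # keep the first 10% of each frame

gated = audio.copy()
for start in range(0, len(gated), frame):
    gated[start + keep:start + frame] = 0           # silence the remaining 90%

wavfile.write("gated_speech.wav", rate, gated)
```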
I remember a demo in uni where the prof started with 8 bit 8kHz sampled speech and sequentially removed bits from the input. Though noisy, the 1-bit version was still mostly intelligible.
Later, I worked on a voice messaging application for my employer, and we repeated the experiment with similar results. Turns out, the human speech recognition wetware looks mainly at zero crossings.
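That demo is easy to recreate: requantize the samples at ever lower bit depths, down to a single bit, which keeps nothing but the sign of the waveform, i.e. its zero crossings. A minimal sketch (the input file is a placeholder):

```python
# Sketch of the bit-stripping demo: requantize speech at ever lower bit depths.
# At 1 bit per sample only the sign survives, i.e. the zero crossings.
import numpy as np
from scipy.io import wavfile

def quantize(signal: np.ndarray, bits: int) -> np.ndarray:
    """Requantize a signal scaled to [-1, 1] down to `bits` bits per sample."""
    levels = 2 ** bits
    return np.round((signal + 1) / 2 * (levels - 1)) / (levels - 1) * 2 - 1

rate, audio = wavfile.read("speech_8khz.wav")        # placeholder 8 kHz file
audio = audio.astype(np.float64)
audio = audio / np.max(np.abs(audio))                # scale to [-1, 1]

for bits in (8, 4, 2, 1):                            # the 1-bit output keeps only the sign
    out = (quantize(audio, bits) * 32767).astype(np.int16)
    wavfile.write(f"speech_{bits}bit.wav", rate, out)
```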
Also Adaptive Delta PCM. That has been around for a long time (for compressed voice channels - the actual input is compared to the predicted input and only the difference is sent).
So predict the output (in real time, as we did with analog electronics some, what, 40 years ago) and modify it slightly.
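For the idea in miniature, here's plain delta modulation, the simplest relative of ADPCM: the encoder keeps a running prediction and transmits only one bit per sample, saying whether the input is above or below it. Real ADPCM also adapts the step size; this fixed-step version is just a toy to show the predict-and-send-the-difference loop.

```python
# Sketch of delta modulation, the simplest relative of Adaptive Delta PCM:
# the encoder keeps a running prediction and sends only one bit per sample,
# saying whether the input is above or below that prediction.
# (Real ADPCM also adapts the step size; this fixed-step version is a toy.)
import numpy as np

def delta_encode(signal: np.ndarray, step: float = 0.02) -> np.ndarray:
    prediction = 0.0
    bits = np.zeros(len(signal), dtype=np.uint8)
    for i, sample in enumerate(signal):
        bits[i] = 1 if sample > prediction else 0
        prediction += step if bits[i] else -step    # the decoder repeats this exactly
    return bits

def delta_decode(bits: np.ndarray, step: float = 0.02) -> np.ndarray:
    prediction = 0.0
    out = np.zeros(len(bits))
    for i, bit in enumerate(bits):
        prediction += step if bit else -step
        out[i] = prediction
    return out
```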
On a slightly different note, Shannon came up with a method to calculate the entropy of the English language. Might be another useful concept to use...
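Shannon's famous estimate (roughly one bit per character) came from human guessing games that exploit context, but the zeroth-order version of the calculation, entropy from bare letter frequencies, is only a few lines and lands around four bits per character. A sketch, with a placeholder corpus:

```python
# Crude version of the entropy calculation Shannon described: estimate
# bits per character from letter frequencies in a sample of English.
# (Shannon's guessing-game method gives ~1 bit/char because it accounts
# for context; this zeroth-order estimate lands near 4 bits/char.)
from collections import Counter
from math import log2

def entropy_per_char(text: str) -> float:
    chars = [c for c in text.lower() if c.isalpha() or c == " "]
    counts = Counter(chars)
    total = len(chars)
    return -sum(n / total * log2(n / total) for n in counts.values())

print(entropy_per_char(open("sample_english.txt").read()))   # placeholder corpus
```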
I would expect it works by adding noise the speech recognition software isn't expecting, but that doesn't mean it can't be modified to ignore that noise. If it doesn't impair conversations between people, how can it be a permanent roadblock to speech recognition software?
If this is widely adopted it'll probably stop working, because the snoops will modify their software to keep snooping. Alexa et al will treat any speech they can't understand as a bug and send a copy of it back to home base for analysis, so using this might make it MORE likely you're snooped upon.
Comment More than 250 mass shootings have occurred in the US so far this year, and AI advocates think they have the solution. Not gun control, but better tech, unsurprisingly.
Machine-learning biz Kogniz announced on Tuesday it was adding a ready-to-deploy gun detection model to its computer-vision platform. The system, we're told, can detect guns seen by security cameras and send notifications to those at risk, notifying police, locking down buildings, and performing other security tasks.
In addition to spotting firearms, Kogniz uses its other computer-vision modules to notice unusual behavior, such as children sprinting down hallways or someone climbing in through a window, which could indicate an active shooter.
In brief US hardware startup Cerebras claims to have trained the largest AI model ever run on a single device, powered by its Wafer Scale Engine 2, the world's largest chip, which is about the size of a plate.
"Using the Cerebras Software Platform (CSoft), our customers can easily train state-of-the-art GPT language models (such as GPT-3 and GPT-J) with up to 20 billion parameters on a single CS-2 system," the company claimed this week. "Running on a single CS-2, these models take minutes to set up and users can quickly move between models with just a few keystrokes."
The CS-2 packs a whopping 850,000 cores, and has 40GB of on-chip memory capable of reaching 20 PB/sec memory bandwidth. The specs on other types of AI accelerators and GPUs pale in comparison, meaning machine learning engineers have to train huge AI models with billions of parameters across more servers.
Microsoft has pledged to clamp down on access to AI tools designed to predict emotions, gender, and age from images, and will restrict the usage of its facial recognition and generative audio models in Azure.
The Windows giant made the promise on Tuesday while also sharing its so-called Responsible AI Standard, a document [PDF] in which the US corporation vowed to minimize any harm inflicted by its machine-learning software. This pledge included assurances that the biz will assess the impact of its technologies, document models' data and capabilities, and enforce stricter use guidelines.
This is needed because – and let's just check the notes here – there are apparently not enough laws yet regulating machine-learning technology use. Thus, in the absence of this legislation, Microsoft will just have to force itself to do the right thing.
In Brief No, AI chatbots are not sentient.
No sooner had the story of a Google engineer, who blew the whistle on what he claimed was a sentient language model, gone viral than multiple publications stepped in to say he was wrong.
The debate over whether the company's LaMDA chatbot is conscious or has a soul isn't a very good one, if only because it's too easy to shut down the side that believes it does. Like most large language models, LaMDA has billions of parameters and was trained on text scraped from the internet. The model learns the relationships between words, and which ones are more likely to appear next to each other.
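That last part is, at heart, counting. A toy bigram model over a made-up scrap of text makes the point; the corpus below is a placeholder, and LaMDA does this with a Transformer and billions of parameters rather than a lookup table, but the objective is the same flavour.

```python
# Toy illustration of "learning which words are likely to appear next to
# each other": bigram counts over a placeholder corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()   # placeholder text

bigrams = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    bigrams[current][following] += 1

def next_word_probs(word: str) -> dict:
    """Probability of each next word given the current one."""
    counts = bigrams[word]
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

print(next_word_probs("the"))   # {'cat': 0.666..., 'mat': 0.333...}
```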
In the latest episode of Black Mirror, a vast megacorp sells AI software that learns to mimic the voice of a deceased woman whose husband sits weeping over a smart speaker, listening to her dulcet tones.
Only joking – it's Amazon, and this is real life. The experimental feature of the company's virtual assistant, Alexa, was announced at an Amazon conference in Las Vegas on Wednesday.
Rohit Prasad, head scientist for Alexa AI, described the tech as a means to build trust between human and machine, enabling Alexa to "make the memories last" when "so many of us have lost someone we love" during the pandemic.
Opinion The Turing test is about us, not the bots, and it has failed.
Fans of the slow burn mainstream media U-turn had a treat last week.
On Saturday, the news broke that Blake Lemoine, a Google engineer charged with monitoring a chatbot called LaMDA for nastiness, had been put on paid leave for revealing confidential information.
Google has placed one of its software engineers on paid administrative leave for violating the company's confidentiality policies.
Since 2021, Blake Lemoine, 41, had been tasked with talking to LaMDA, or Language Model for Dialogue Applications, as part of his job on Google's Responsible AI team, checking whether the bot used discriminatory or hate speech.
LaMDA is "built by fine-tuning a family of Transformer-based neural language models specialized for dialog, with up to 137 billion model parameters, and teaching the models to leverage external knowledge sources," according to Google.
GPUs are a powerful tool for machine-learning workloads, though they’re not necessarily the right tool for every AI job, according to Michael Bronstein, Twitter’s head of graph learning research.
His team recently showed Graphcore’s AI hardware offered an “order of magnitude speedup when comparing a single IPU processor to an Nvidia A100 GPU,” in temporal graph network (TGN) models.
“The choice of hardware for implementing Graph ML models is a crucial, yet often overlooked problem,” reads a joint article penned by Bronstein with Emanuele Rossi, an ML researcher at Twitter, and Daniel Justus, a researcher at Graphcore.
As compelling as the leading large-scale language models may be, the fact remains that only the largest companies have the resources to actually deploy and train them at meaningful scale.
For enterprises eager to leverage AI to a competitive advantage, a cheaper, pared-down alternative may be a better fit, especially if it can be tuned to particular industries or domains.
That’s where an emerging set of AI startups hope to carve out a niche: by building sparse, tailored models that, while perhaps not as powerful as GPT-3, are good enough for enterprise use cases and run on hardware that ditches expensive high-bandwidth memory (HBM) for commodity DDR.
AI is killing the planet. Wait, no – it's going to save it. According to Hewlett Packard Enterprise VP of AI and HPC Evan Sparks and professor of machine learning Ameet Talwalkar from Carnegie Mellon University, it's not entirely clear just what AI might do for – or to – our home planet.
Speaking at the SixFive Summit this week, the duo discussed one of the more controversial challenges facing AI/ML: the technology's impact on the climate.
"What we've seen over the last few years is that really computationally demanding machine learning technology has become increasingly prominent in the industry," Sparks said. "This has resulted in increasing concerns about the associated rise in energy usage and correlated – not always cleanly – concerns about carbon emissions and carbon footprint of these workloads."