Alexa, purchase a Bitcoin
is the new stealth hit song. Wish I knew how to set it to repeat.
Computer science boffins affiliated with IBM and universities in China and the United States have devised a way to issue covert commands to voice-based AI software – like Apple Siri, Amazon Alexa, Google Assistant and Microsoft Cortana – by encoding them in popular songs. They refer to these tweaked tunes, which issue mostly …
so you have a device listening in all the time and you are surprised when it hears things? oh my. next people will be using photographs to fool facial recognition locks...
On a second note, we try to set my mate's Alexa off all the time if he has Discord running on speaker. Sometimes he does forget to switch Alexa off.
The worst part is most of these voice recognition devices are always on and listening even when set to "off". We have a smart thermostat whose voice-command feature is set to "off", yet about once a month it just belts out an "I didn't understand that". When I check whether its voice-command feature got turned on by someone, the settings always show it "off". Be afraid, very afraid.
Or better yet, get rid of the thing.
My phone used to have an occasional habit of suddenly saying "Sorry, I did not understand that question" ... normally during the R4 Today programme. It turned out that Android skips the "OK Google" step and is always listening while being charged, unless a setting to stop this is set. Hence, while charging on my bedside table beside my radio it had spent weeks, if not months, listening to John Humphrys et al!
Kudos to the Register for getting 90% of a story across in very few paragraphs.
I would not normally grinch over the sentence “a microphone capable of suppressing ultrasound” but it leads to a couple of observations:
1) Microphones can be designed to suppress ultrasound, but that either makes them bulky or expensive. More likely the suppression is happening in the amplifier stage.
Interestingly, even this is an expense that volume manufacturers avoid, as we found to our astonishment when we ran a hack day around nature exploration: trying to use off-the-shelf gadgets to listen to bats. Out of several laptops and mobile phones, about half had no hardware limiting the frequency range to the human-audible spectrum! Not surprising, given that heavily oversampling and then filtering in the digital domain works fine (a rough digital-filter sketch follows below), but I digress. Anyway, the observation was that the cheaper the laptop or phone, the easier it was to hack for bat-listening.
So, for money reasons alone, how much would you like to bet that even in three years’ time plenty of connected microphones will still be responding to ultrasound input?
2) Ultrasound is actually wanted by gadgets: digital fingerprinting of music, advertisements etc. has been reported aplenty recently, even in the Register iirc. Another reason to doubt that ultrasound will be ignored by connected mics.
More likely that gadgets will increase their listening range (“to better listen out for your protection”, says grandma, and we all feel comforted) and countermeasures will have to evolve in the signal processing, where the success rate will be unknown, and thus it ever bumbles on... :)
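For the curious, rejecting ultrasound in the digital domain really is only a few lines of DSP, which is why it's odd so few gadgets bother. A rough sketch, assuming a 96 kHz capture and scipy's Butterworth filter; the cutoff and order here are illustrative, not what any real device uses:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def strip_ultrasound(samples, sample_rate=96_000, cutoff_hz=20_000, order=8):
    """Low-pass filter a captured signal so anything above the
    (assumed) 20 kHz human-audible limit is attenuated."""
    sos = butter(order, cutoff_hz, btype="low", fs=sample_rate, output="sos")
    return sosfiltfilt(sos, samples)

# Toy demo: a 1 kHz audible tone mixed with a 30 kHz ultrasonic carrier.
t = np.arange(96_000) / 96_000
mixed = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 30_000 * t)
clean = strip_ultrasound(mixed)  # the 30 kHz component is now heavily attenuated
```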
If you can make the speakers play ultrasound then you should set it to play in the autumn, when bats tend to come inside the house. They don't like ultrasonic noise. I recall a research mouse facility whose mice were not breeding; it turned out their aircon system was screaming in the ultrasonic. They fixed it and the mice bred like, well, small rabbits.
Random Biology: male mice sing like canaries in the ultrasonic to attract the ladiez and also scream in the ultrasonic as they climax. So you can imagine how ultrasonic noise spoils the vibe.
"Microphones can be designed to suppress ultrasound, but that either makes them bulky or expensive"
In many cases just using a heavier diaphragm in the microphone capsule would do the job (extra material cost close to zero). I suspect it's more a case of apathy than expense.
Until voice recognition can reliably identify individuals, it would be cool if you could name each device locally.
I.e. if I say “Alexa” or “Siri” it only gives me very basic guest type access.
However if I say “AmazonMcSpyFace” instead, it sort of salts my following voice command, giving me higher privileges.
Disclaimer: I don’t own any of these, they possibly already allow this.
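For what it's worth, the wake-word-as-privilege idea is trivial to sketch, even if no shipping assistant exposes it as far as I know. A toy illustration, with every name and tier made up:

```python
# Hypothetical mapping of wake words to privilege tiers. The "salted"
# name unlocks more than the well-known default one.
WAKE_WORDS = {
    "alexa": {"weather", "timers", "music"},              # guest tier
    "amazonmcspyface": {"weather", "timers", "music",
                        "purchases", "locks", "camera"},  # owner tier
}

def allowed(wake_word: str, action: str) -> bool:
    """Return True if the given wake word unlocks the requested action."""
    return action in WAKE_WORDS.get(wake_word.lower(), set())

assert allowed("AmazonMcSpyFace", "purchases")
assert not allowed("Alexa", "purchases")
```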
These devices have a low power chip that listens continuously for anything resembling "Alexa" or "OK Google". When it hears something that matches it sends a recording to the cloud for speech recognition. Putting proper speech recognition into a low power chip would be difficult.
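Schematically, the split looks like this: a cheap always-on gate in front of an expensive cloud recogniser. The sketch below is a stand-in, not any vendor's actual pipeline; the "spotter" is faked with signal energy where real chips run a tiny neural net:

```python
import numpy as np

FRAME = 1600        # 100 ms of 16 kHz audio per frame (assumed rate)
THRESHOLD = 0.8     # illustrative confidence cutoff

def cheap_wake_word_score(frame: np.ndarray) -> float:
    """Stand-in for the low-power keyword spotter: it only answers
    'how much does this frame sound like the wake word?'"""
    # Real chips run a small fixed model; here we fake it with energy.
    return float(np.clip(np.sqrt(np.mean(frame ** 2)) * 10, 0, 1))

def send_to_cloud(audio: np.ndarray) -> str:
    """Stub for the heavyweight server-side speech recogniser."""
    return "<transcript from cloud>"

def listen(frames):
    for frame in frames:
        if cheap_wake_word_score(frame) >= THRESHOLD:  # gate opens...
            return send_to_cloud(frame)                # ...recording leaves the house
    return None

# Toy input: nine quiet frames, then one loud enough to trip the gate.
frames = [np.random.randn(FRAME) * 0.01 for _ in range(9)]
frames.append(np.random.randn(FRAME) * 0.5)
print(listen(frames))
```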
The strange thing is that early attempts at speech recognition (what you say) turned out to be voice recognition (who is speaking) devices. The downside is that such antique tech requires training. Say "Siri, recognise my voice" a hundred times and a low power chip probably could recognise you (but it would also respond to you saying "OK Google", or "OW! Who spread drawing pins on the floor?"). The problem is to find customers with enough brains to understand the problem, enough patience to actually train the device, and sufficient courage/gullibility to let such a device into their home.
I don't care about South Park. I care about an event that gets seen by millions of Americans throughout the country. I just picked it for being one of the most-watched events in the US. In Australia, it would probably be the AFL Grand Final, or anything else of similar caliber.
So, to summarise: they tuned audio to be recognised by a specific analysis engine, and then tested it by having that specific engine recognise it.
And this won't work on any existing products until they reverse-engineer their recognition. That could be difficult, since it's based on machine learning and is likely to be opaque.
"And this won't work on any existing products until they reverse-engineer their recognition."
Not true at all. It's easy to create adversarial images by treating the recognition system as a black box, and there's no reason sound would be any different. If anything, it actually makes things much easier - you don't need to know anything about how it works, you just try over and over again making small changes to the input until you get the output you want. No understanding or thinking required at all, which ironically makes machine learning the perfect tool to screw with systems that rely on machine learning - if you use machine learning for a recognition system, I can use machine learning to learn how to break it, without ever needing to know what your machine learning system has actually learned.
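To make that concrete, here's a minimal random-search sketch against a stand-in black box (a fixed linear scorer; a real attack would query the actual recogniser and use something smarter than coin-flip hill climbing, but the principle is the same):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(128)  # hidden weights: the attacker never sees these

def black_box(x: np.ndarray) -> float:
    """Opaque recogniser: returns confidence that x is the target class."""
    return 1 / (1 + np.exp(-w @ x))

# Start from a benign input and nudge it until the box says "target".
x = rng.standard_normal(128) * 0.1
score = black_box(x)
for _ in range(20_000):
    candidate = x + rng.standard_normal(128) * 0.01  # small random tweak
    s = black_box(candidate)
    if s > score:           # keep any tweak that moves the output our way
        x, score = candidate, s
    if score > 0.99:        # the box is now fooled
        break

print(f"final confidence: {score:.3f}")
```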
This is equivalent to saying "It doesn't matter that your password is long, I can just keep trying and trying until I find a pattern that fools you!"
Technically correct, but we all know why that won't actually work.
This is an interesting theoretical attack, but the fact they haven't extended their open-box, clean-room, in-the-lab theory into a practical proof of concept speaks volumes.
"This is equivalent to saying "It doesn't matter that your password is long, I can just keep trying and trying until I find a pattern that fools you!"
Technically correct, but we all know why that won't actually work."
No we don't. Mainly because it absolutely does work. That's exactly what is meant when we say, for example, that MD5 is broken because of the possibility of collision attacks. It's literally the exact same thing - an attacker tries lots of different inputs until they find one that happens to give the desired output. The only real difference is that with cryptographic functions that's a big problem that we try to avoid, while with machine learning systems it's a design feature; the whole point of image recognition is to feed in lots of different pictures and get a limited set of outputs - dog, cat, car, etc. - so attacking it is just a matter of making small changes until the output switches from one to another.
Basically, both systems simply convert an input to an output. Cryptographic functions would ideally be one-to-one, but in practice are always many-to-one and therefore at least theoretically attackable; they simply rely on making such attacks unfeasible given the current level of technology. Machine learning systems are many-to-one by design, and are therefore inherently vulnerable.
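The search-until-it-matches idea is easy to demo on a deliberately weakened target. The sketch below brute-forces two inputs sharing the first six hex digits of their MD5 hashes; real MD5 collision attacks exploit the function's internal structure rather than raw luck, but the attacker's-eye view is the same:

```python
import hashlib

def md5_prefix(data: bytes, hex_digits: int = 6) -> str:
    """Truncated MD5: a many-to-one function weak enough to attack by search."""
    return hashlib.md5(data).hexdigest()[:hex_digits]

# Birthday bound: with 16**6 possible prefixes, a collision is expected
# after only a few thousand distinct inputs.
seen = {}
i = 0
while True:
    msg = f"input-{i}".encode()
    digest = md5_prefix(msg)
    if digest in seen and seen[digest] != msg:
        print(f"collision: {seen[digest]!r} and {msg!r} -> {digest}")
        break
    seen[digest] = msg
    i += 1
```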
We need a checkbox response for "security hacks".
This one would be "You construct your own (flawed) system that you claim is similar to the original, and hack that, never proving the hack works on the readily-available original."
Neural networks are inherently untrustworthy. It's trivial to train a bad one that superficially appears to work. There are many stories of NNs that were later found to be deeply flawed. One was a tank/APC image recogniser that was dramatically 'better' than humans, spotting tanks that were expertly concealed. It turned out it was classifying road ruts as positives, not armoured vehicles.
There are now tools that help visualise intermediate node responses in specific types of TensorFlow networks. But that covers a tiny fraction of systems, you need to be an expert to understand what you are seeing, and it only works for images. It's actually more directly useful for figuring out that a system is flawed than for improving it (although one can lead to the other). Note, too, that it requires access to the intermediate nodes, which the end user doesn't have with the cloud processing of Alexa and the like.
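For anyone who does have the model in hand, pulling out intermediate activations (the raw material those visualisation tools work from) is a one-liner in Keras/TensorFlow. A minimal sketch with a throwaway model, nothing to do with any real product:

```python
import numpy as np
import tensorflow as tf

# Throwaway model standing in for a real image classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu", name="conv1"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# A second model that exposes what the intermediate layer responds to.
probe = tf.keras.Model(inputs=model.input,
                       outputs=model.get_layer("conv1").output)

activations = probe(np.random.rand(1, 32, 32, 3).astype("float32"))
print(activations.shape)  # (1, 30, 30, 8): per-position filter responses
```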
There was a time in Blighty when HMG had to ban subliminal messaging in cinema and broadcast media.
All any of our ad execs would need to do is hide some nonsense about a product/service at the end of the ad and hey presto, your slimline plastic friend will ask you if you want to know more.
Ofcom had better get its regulation head on...
a technological reason to tell the youth to "turn that sh*te down!"
I wonder how long until "pirated" audio or video tracks start "phoning home"?
Phishing with music files "traded" amongst associates?
With some of the craptacular techno-modified stuff out there, would one ever know till it was too late?
It's only a matter of time before there are web sites where you can select an audio file, type in your desired command, select a target device type, and then download your custom attack. Then the real fun begins.
Alexa, set the temperature to 40 C
OK Google, open the garage door. OK Google, close the garage door.
Siri, show all the pics in the folder named private on the TV
Etc.