Lend me your ears and AI will play with your brain: Machine voice imitators outsmart us

Human brains appear to be better at detecting fake images than fake voices, a distinction that bodes ill for security as voice spoofing technology becomes more effective and more accessible. In a blog post this week, Adrian Colyer, venture partner with Accel in London, explored a paper presented earlier this year at 2019 …

  1. Olivier2553

    Need more knowledge how the black box works

    That's something that bugs me with all the recent AI solutions based on neural networks: there never seems to be any effort to understand why something works the way it does.

    AI is good at distinguishing between a real voice and an impersonator. The statistics say so. But what traits are detected in the voice signal that contribute to that decision?

    The researchers are supposed to know what is inside their NN, and maybe they need to focus more on how the various parameters get adjusted to produce the AI black box.

    1. AMBxx Silver badge
      Big Brother

      Re: Need more knowledge how the black box works

      I did some work as a trial on recruitment data to work out which candidates would leave within the first 6 months.

      Using a neural network, I was able to predict 50% of early leavers if I tweaked for no false positives. Much higher if I allowed a few false positives. This would lead to a massive reduction in recruitment costs for companies with high staff turnover (ignoring the legalities of the data!).
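      For readers unfamiliar with "tweaking for no false positives", here is a minimal sketch of the idea, assuming the model emits a score per candidate and a cut-off is chosen afterwards. The function name, the scores and the labels are all made up for illustration; this is not the model or the data from the trial described above.

```python
def recall_at_false_positive_budget(scores, labels, max_false_positives=0):
    """Return (threshold, recall) when at most max_false_positives negatives are flagged.

    scores: hypothetical model outputs, higher = more likely an early leaver.
    labels: 1 for an actual early leaver, 0 otherwise.
    A candidate is flagged when their score is strictly above the threshold.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    if not positives:
        raise ValueError("need at least one positive example")
    # The threshold sits on the highest-scoring negative we refuse to flag.
    if max_false_positives < len(negatives):
        threshold = negatives[max_false_positives]
    else:
        threshold = float("-inf")
    caught = sum(1 for s in positives if s > threshold)
    return threshold, caught / len(positives)


# Made-up scores for eight candidates; not real recruitment data.
scores = [0.95, 0.90, 0.60, 0.55, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

strict_threshold, strict_recall = recall_at_false_positive_budget(scores, labels)
loose_threshold, loose_recall = recall_at_false_positive_budget(scores, labels, 1)
print(strict_threshold, strict_recall)  # 0.65 0.5
print(loose_threshold, loose_recall)    # 0.4 1.0
```

      With these toy numbers, refusing any false positive catches half of the leavers, and tolerating a single one catches all of them. The trade-off has the same shape as the one described above, though the figures themselves are invented.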

      I ran a slightly different model using more formal regression and could find no pattern at all.

      As you say, I had no idea what was going on in the black box. Nor do I know if I could trust it.

      1. JetSetJim
        Mushroom

        Re: Need more knowledge how the black box works

        > I had no idea what was going on in the black box. Nor do I know if I could trust it

        The crux of the problem with AI/NN. There are a few tools around that attempt to give insight into this (IIRC, fairly recently there was one that illustrated which features of an image a NN was using to classify cars), but even these aren't the best indicators and I didn't really gain a sense that the indicators were meaningful. For example, there was a wolf/dog classifier trained on a large number of images, but it turned out it had learned that all the wolf images in the training set had a snowy background.
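        The snowy-background failure can be reproduced with a toy example. The sketch below is an assumption-laden illustration, not the classifier from that anecdote: it swaps the neural network for a one-feature decision stump, and the `snow` and `pointy_ears` features and every data row are invented.

```python
def fit_stump(rows, labels):
    """Pick the single binary feature that best predicts 'wolf' on the training set."""
    best = None
    for feature in rows[0]:
        for polarity in (1, 0):  # the feature value that is read as "wolf"
            hits = sum((row[feature] == polarity) == (label == "wolf")
                       for row, label in zip(rows, labels))
            accuracy = hits / len(rows)
            if best is None or accuracy > best[0]:
                best = (accuracy, feature, polarity)
    return best


def predict(stump, row):
    _, feature, polarity = stump
    return "wolf" if row[feature] == polarity else "dog"


# Invented training set: every wolf photo has snow, no dog photo does.
train_rows = [
    {"pointy_ears": 1, "snow": 1}, {"pointy_ears": 1, "snow": 1},
    {"pointy_ears": 1, "snow": 1}, {"pointy_ears": 0, "snow": 1},
    {"pointy_ears": 0, "snow": 0}, {"pointy_ears": 0, "snow": 0},
    {"pointy_ears": 0, "snow": 0}, {"pointy_ears": 1, "snow": 0},
]
train_labels = ["wolf"] * 4 + ["dog"] * 4

# Invented test set where the background no longer matches the animal.
test_rows = [
    {"pointy_ears": 1, "snow": 0}, {"pointy_ears": 1, "snow": 0},
    {"pointy_ears": 0, "snow": 1}, {"pointy_ears": 0, "snow": 1},
]
test_labels = ["wolf", "wolf", "dog", "dog"]

stump = fit_stump(train_rows, train_labels)
test_accuracy = sum(predict(stump, row) == label
                    for row, label in zip(test_rows, test_labels)) / len(test_rows)
print(stump, test_accuracy)
```

        The stump scores perfectly on the training rows by looking only at `snow`, then gets every test row wrong once wolves appear on grass and dogs in snow. A real network is far harder to inspect, which is the complaint here.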

        It's the inability of the NN to tell you what it is doing, plus the high risk of innocuous-looking biases in the training data, that will lead to many failures in NN deployments. I would not trust it to do anything remotely safety-critical until this is addressed.

        That said, there are probably many cases of the brain behaving in exactly the same manner. In how many cases in this experiment (or any other human vs AI experiment, for that matter) could the individual enumerate precise reasons why they classified a voice as fake or real? Sometimes it's possible, for sure, but other times it's more arm-waving.

  2. Teiwaz

    So the sounds, but what about the content.

    Provided the speaker is known to the listener (and the listener pays attention for more than a minimal amount of time), surely a red flag has to go up with the listener if the kind of language used is not how the speaker normally communicates?

    1. Keven E

      Re: So the sounds, but what about the content.

      In 1981 or so... just before CDs came out to the general public, a dozen or so of us music students were dragged down to Universal Studios (famous in Chitown) and brought into two engineers' booths. In the first one they said to "listen to this". We then moved into the booth next door and they said "now listen to this". We all agreed the first one was better. Much better.

      It turned out the first one was an analogue recording, and the second one was the same recording converted to the new CD format. Granted, we were all seasoned listeners, but that is the point. Since digital recording arrived (and the early recordings were the worst), there are pieces of the spectrum that just aren't heard, and that the general public therefore has not expected or been trained to hear for most of the last 40 years. Generally these are not the primary content but clearly add some flavor: subtle overtones and timbre are missing, and today nobody is aware of the lack.

      The voice is quite a complex sound, and the nuances of language, timing and timbre are, as far as I'm concerned, best left for humans to experience with others.

      Saying "machine voices outsmart us" is as relevant as "is it live or is it Memorex?".

  3. druck Silver badge
    Holmes

    Paintings

    I'm puzzled about the previous study looking at real and fake Rembrandt paintings. Were they using paintings in the same style that were not real Rembrandts, or a completely different style of painting? Were they using ordinary people, or trained art historians who would actually know the difference? It could be argued that paintings are equivalent to fake photographs, and this is what is giving the changes in brain patterns.

  4. Anonymous Coward
    Anonymous Coward

    This isn't a new thing with AI

    Human voice impersonators have been able to fool other humans since long before computers even existed. If Nixon had been as unabashed a liar as Trump, he would have claimed the Watergate tapes were a forgery done by Rich Little.

  5. Anonymous Coward
    Anonymous Coward

    People were able to tell the difference between real and fake, but the researchers couldn't find any measurable difference in the neural activity.

    It sounds like the researchers need better ways of measuring neural activity.

  6. Andy 97

    This is exciting stuff, but how long before people start asking Blade Runner-esque questions at the start of calls?

    “A turtle was flipped over, it’s baking in the sun. You spot it, but don’t help, why is that?”

    1. Simon Harris
      Terminator

      "What's wrong with Wolfie? I can hear him barking"

    2. Anonymous Coward
      Anonymous Coward

      “A turtle was flipped over, it’s baking in the sun. You spot it, but don’t help, why is that?”

      Just use the reverse starfish gambit... "because if I turned it back over then someone would just say 'OK, you saved that one, but what about all the other turtles that are flipped over? You're not exactly helping them, are you?'"

    3. Teiwaz

      This is exciting stuff, but how long before people start asking Blade Runner-esque questions at the start of calls?

      Should have probably started that decades ago to weed out the worst type of predatory phone salesperson.

  7. spold Silver badge

    Opens letterbox....

    UNLOCK THE FRONT DOOR ALEXA!

  8. Michael Wojcik Silver badge

    Colyer's blog

    Nice to see the Reg mentioning Adrian Colyer's blog (the morning paper). I recommend it for those interested in following contemporary CS research. Colyer doesn't shy away from the technical details of the papers he discusses, but he does a nice job of summarizing them in a way that's accessible to non-experts. (Non-experts in that particular field, that is. You'll want some CS background for most of them.)

  9. wwwhatsup
    Go

    VIDEO OF THIS AND SIMILAR PRESENTATIONS AT NDSS

    https://www.youtube.com/playlist?list=PLfUWWM-POgQvk44ZNBQ9v4FcQ0VtaZ9Yk
