"... participants were paid three cents per recording"
The end results were not worth the money.
Machine learning researchers are on a mission to make machines understand speech directly from audio input, like humans do. At the Neural Information Processing Systems conference this week, researchers from Massachusetts Institute of Technology (MIT) demonstrated a new way to train computers to recognise speech without …
"The goal of this work is to try to get the machine to learn language more like the way humans do," said Glass.
Is that how people do it? I have aphantasia (my "mind's eye" is blind), and I had no idea visualisation was involved in most people's understanding of speech.
I just convert straight to text.
Interesting. I was wondering about this the other day. I was reading about subvocalisation, which is where people mentally form words when reading. When learning foreign languages, this can be a necessary step, but if you get into the habit of hearing each single word in your head as you read it, it means that your reading speed is limited to how fast you can vocalise it (so reading speed = speaking speed, effectively).
Anyway, that got me thinking about how people who are deaf from birth process written material. I suppose that's a variation on this "aphantasia" you mentioned, though I still wonder whether people who were born deaf can have mind's-eye style auditory hallucinations even absent the signals needed to prime it. Is it possible that the brain uses other sense data (such as muscle memory of tongue position, mouth shape and so on, as gained from speech practice) as a proxy for subvocalisation?
"Interesting. I was wondering about this the other day. I was reading about subvocalisation, which is where people mentally form words when reading. When learning foreign languages, this can be a necessary step, but if you get into the habit of hearing each single word in your head as you read it, it means that your reading speed is limited to how fast you can vocalise it (so reading speed = speaking speed, effectively)."
Well that explains my reading speed. Thank you.
As I put it, I can't visualize my way out of a wet paper bag. What's strange is that I don't recognize words by the individual letters but as a gestalt (a unique symbol all its own). That comes in handy in that maths of any type are processed as sentences with their own symbol sets. Oh, and pictographic languages are simpler. Still, knowing where I put something is a chained set of vectors. Thanks for the name of the condition. I just thought I was one really weird autistic.
"I just thought I was one really weird autistic" --- Jack of Shadows
I heard about it through this BBC article, which contains a short test. I am boringly average, of course.
@John Woods
Thanks for the link. I also got a boring score - slightly on the low side. But I found the questions difficult to answer. And as the test progressed, I started to wonder whether what I was imagining was really an 'image'.
The article has a quote from one person:
"When I think about my fiancee there is no image, but I am definitely thinking about her, I know today she has her hair up at the back, she's brunette. But I'm not describing an image I am looking at, I'm remembering features about her, that's the strangest thing and maybe that is a source of some regret."
If he's not describing an image, what is it he's remembering? I'm now confused, but very fascinated.
There are 7,000 languages, and I think less than 2 percent have ASR [automatic speech recognition] capability, and probably nothing is going to be done to address the others.
Very probably, since about 6000 of those are expected to have no native speakers left by the end of this century. However, the 140 languages where ASR work is being done probably cover the primary and secondary languages of 90% of humanity today.
"The Babel fish is small, yellow, leech-like, and probably the oddest thing in the Universe. It feeds on brainwave energy received not from its own carrier, but from those around it. It absorbs all unconscious mental frequencies from this brainwave energy to nourish itself with. It then excretes into the mind of its carrier a telepathic matrix formed by combining the conscious thought frequencies with nerve signals picked up from the speech centres of the brain which has supplied them. The practical upshot of all this is that if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language. The speech patterns you actually hear decode the brainwave matrix which has been fed into your mind by your Babel fish. [...]
"Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation."
-- DNA, H2G2
We've had the Echo in the US for some time, enough time to get used to Alexa. Unlike the picture you don't communicate with Alexa by grabbing a handheld and yelling at it, you talk in a conversational tone, albeit in the imperative. She's pretty good at picking out requests even in a noisy environment. Her speech is a bit like the machine's from the movie "Her" -- you don't really think of it as a machine so you find yourself inserting random "please" and "thank you" phrases into the conversation. (You only notice the machine when you get her to read from a book -- the tone's a bit flat, as if she was "on the spectrum".)
We've now got to the point where most sci-fi movies, even relatively recent ones, look horribly dated. Alexa's more than an interface; she learns and can be uncanny in the way she selects music and the like. She's still a machine, though, with a relatively simple backend so I can't wait to see what gets served up in even the near future as things develop.
Is this actually cheap image recognition or are they paying over the odds?
And it's incomplete, I see a fence and a pavement in the first picture but neither seems to be mentioned. And did 'sidle' go out of fashion or can I just not see the crab that is doing that 'sidewalk' thing that they do?
The 'tar everywhere' got cleared up, right? That can get expensive when you have cars sat in it, at least none looked like it got on their paintwork.
+/- that 'divided by a common language' quote.
I think this goal is strongly linked to full general AI...
It's because of the context sensitivity in our understanding of human language. One word can mean one thing in one context, and another thing in another context. Words vary in meaning, with everything from slight variation to extreme variation, depending on the context in which they're uttered.
Because of this, in order for a computer to understand language fully, like a human being, the computer will also have to understand the world fully (or, at least as fully as a normally intelligent human being would understand it today).
It means that this goal cannot possibly be reached fully until we've also fully simulated the whole general intelligence of human beings, in all its breadth and in all its depth...
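To make the context-sensitivity point concrete, here is a toy sketch (not from the article) in the spirit of the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the surrounding sentence. The glosses and example sentences are invented for illustration, and real systems need far more than bag-of-words overlap, which is exactly the commenter's point about needing world knowledge.

```python
# Toy word-sense disambiguation for the ambiguous word "bank".
# Each sense has a hand-written gloss (invented for this example).
senses = {
    "financial": "a place where people deposit and withdraw money",
    "river": "the sloping ground along the edge of a river",
}

def disambiguate(sentence: str) -> str:
    """Return the sense whose gloss overlaps most with the sentence's words."""
    words = set(sentence.lower().split())
    # Score each sense by the number of shared words; highest score wins.
    return max(senses, key=lambda s: len(words & set(senses[s].split())))

print(disambiguate("she went to the bank to deposit money"))     # financial
print(disambiguate("they sat on the grassy bank of the river"))  # river
```

Even this crude heuristic only works because the glosses happen to share surface words with the sentences; swap in "she queued at the bank" and the overlap vanishes, which is where genuine world knowledge would have to take over.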