Lost voices, ignored words: Apple's speech recognition needs urgent reform

As someone who relies on Apple's Voice Control application to dictate, navigate, and interact with my iPhone and Mac via my voice due to a severe physical disability, I can't help but feel both grateful for its existence and frustrated by its shortcomings. Apple has made commendable progress this year with accessibility …

  1. Pascal Monett Silver badge

    Interesting column

    I'm happy that Dragon still has a life somewhere, and that people who really need it can still use it. I've always heard good things about Dragon. Nice to see the fire is still alive.

    That said, there are some issues which, I think, could be easily solved. The proper noun issue would go away if, when recording a new one, you had the option to Always Capitalize. That should be rather simple to implement.

    The issue with Will, or The Sun, is simple to comprehend: the product is not aware of the context in which it is working. It "hears" something, and the code happily goes to the nearest match and, bingo, you've got Will, a person, when you're talking about your will to do something. Maybe, before using the match, the product could detect that more than one possibility exists, and pop up its proposal so you could just say no? I have no idea how practical that would be, and it would certainly complicate things in all the other cases that work fine.

    But before proposing The Sun, the product could have a basic notion of how many times the user has already written about newspapers. If it's never, then maybe don't do that automatically?

    Or better: allow the user to invalidate a match. No, I never talk about The Sun, stop using that match.

    That shouldn't be too difficult to implement, should it ?
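    The two fixes proposed above (an "always capitalise" flag for user-recorded proper nouns, plus a user-maintained list of invalidated matches) really could be a simple post-processing pass over a recogniser's candidate transcriptions. A minimal sketch, purely illustrative and in no way Apple's actual implementation; all names and data structures here are invented:

    ```python
    # Hypothetical post-processing pass over a speech recogniser's candidates,
    # sketching two user-controlled fixes:
    #   1. an "always capitalise" flag for user-recorded proper nouns, and
    #   2. a block list of matches the user has invalidated ("never The Sun").

    ALWAYS_CAPITALISE = {"rover"}   # proper nouns the user has recorded
    BLOCKED_MATCHES = {"the sun"}   # matches the user has said "no" to

    def pick_candidate(candidates):
        """Return the first candidate that isn't on the user's block list."""
        for text in candidates:
            if text.lower() not in BLOCKED_MATCHES:
                return text
        return candidates[0]  # fall back rather than drop the utterance

    def apply_user_rules(words):
        """Capitalise any word the user flagged as a proper noun."""
        return [w.capitalize() if w.lower() in ALWAYS_CAPITALISE else w
                for w in words]

    # "The Sun" is blocked, so the plain reading wins:
    choice = pick_candidate(["The Sun", "the sun is out"])
    print(" ".join(apply_user_rules(choice.split())))  # -> the sun is out
    ```

    The point of the sketch is that neither rule needs any deep language modelling: both are lookups the user controls, applied after recognition.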

    1. big_D Silver badge

      Re: Interesting column

      But with companies championing AI, and Apple putting neural cores in their iPhones and Apple Silicon Macs, it is a poor show that they still can't work out the context of a sentence and decide whether a proper noun or a verb is needed. It is even worse in German, where every verb can also be used as a noun and all nouns start with capital letters; even the keyboard on the iPhone and iPad gets confused by this all the time.

      Having the option to mark words as proper nouns would be useful - although I suspect, given the sinking levels of comprehension, that might confuse some users.

      I know that when I learnt German, I had to go back and relearn some English concepts I took for granted. I knew what nouns, pronouns, adjectives and adverbs were and could use them without thinking, but actually thinking about them, when trying to apply them to a foreign language, made me realise how little I need to think when speaking or reading English. And the verb tenses: present, future I/II, conjunctive, past, past perfect, future I/II conjunctive, future progressive and so on. Then subject, accusative, dative, genitive, singular, plural (and then, in German, masculine, feminine or neuter).

  2. hitmouse

    Microsoft's English recognition is pretty good, but it has made little effort outside of US vocal dialects and spelling. If you switch to British or Australian English (for example) in Teams then you have a double issue: recognition is poorer, and ludicrous homophones appear in the transcript, e.g. it assumes "cheques" is uniformly used where "checks" is uttered. See also "draughts", "philtres" and a number of other default renderings, even when grammatically incorrect.

    1. Anonymous Coward
      Anonymous Coward

      You have got to be careful using English recognition. Don't say "I would like to help my friend skill the president" ... you would have to tell the police it's a software recognition issue when you get arrested (hopefully just arrested and not shot).

    2. GruntyMcPugh Silver badge

      We've had a couple of funnies with Teams telephony sending transcripts for missed calls. One user reported a potentially abusive message to us from 'Mistress Spanky', when the message was actually from Mrs Stransky, and another from 'ADP Foreign Security' turned out to be 'ADT Fire and Security'. So we now have a small worry that transcripts of meetings contain howlers, that people in meetings might be seen to be agreeing with them, and that this could become a subject access request / FOIA embarrassment.

  3. Phones Sheridan Silver badge

    Accessibility has never been an Apple priority. This has been covered to death in disabled user support forums for various products, basically forever. Apple products are perfect products for perfect people. Disabled people just make the brand look ugly; accessibility concessions dilute the perfect UI experience they have so carefully crafted. Apple genuinely want you to go elsewhere.

    Your existing is wrong!

    1. Anonymous Coward
      Anonymous Coward

      Sad to say it's not just Apple who can be like that, but I see where you're coming from. Apple UI designer Peter Bickford's book got a bad review for exactly this, and then Apple seemed to change with the release of Mac OS X, but they have been letting things slide more recently.

    2. ethindp

      This is pretty much the problem with apps and companies everywhere. Accessibility is always an afterthought. Rarely have I seen a company think of it and incorporate it during the design process, and then continue developing it as the product(s) evolve. And whenever we try to pass laws to get these companies to give a damn, the companies manage to get the laws stalled. I'd be all for just ramming laws through to improve this without giving companies any time to lobby or stall anything.

      Accessibility needs to be something that is over-emphasized in UI design courses, books, etc. Dedicate half the course/book to it. It isn't something that you can just bolt on afterwards (trust me, I've tried it, and I'm blind, and it's far harder than you think, and it never feels entirely complete because, at the end of the day, it's usually just a huge number of hacks); it needs to be thought about and deeply incorporated with the rest of the app's design and functionality before development even begins.

    3. Stuart Castle Silver badge

      I'm no expert, so apologies if I have this wrong, but a few years back (and I am talking the early 2000s), Apple did have very good accessibility options compared to the opposition.

      The problem is that the competition, in this case Microsoft, has come on leaps and bounds, and while, based on the accounts of some of our disabled users, the features in Windows aren't great, they are generally adequate. Beyond the introduction of Voice Control, and lumping a whole load of permissions restrictions under the heading of "Accessibility", Apple haven't really changed the accessibility options of macOS that much. They certainly have not improved them.

      1. Orv Silver badge

        It depends on your point of comparison. Their accessibility options for macOS are worse than the third-party stuff that's available for Windows. On the other hand, I gather iOS accessibility is quite a bit better than Android.

    4. Anonymous Coward
      Anonymous Coward

      @Phones Sheridan

      "Disabled people just make the brand look ugly, accessibility concessions dilutes the perfect UI experience they have so carefully crafted. Apple genuinely want you to go elsewhere."

      So why the whine? Just go elsewhere. Problem sorted?

      1. Phones Sheridan Silver badge

        So you're agreeing with my assessment of Apple and you're telling the author of the article to stop whining and go elsewhere?

        1. Anonymous Coward
          Anonymous Coward

          @Phone Sheridan

          Ok I will make this easy for you to understand.

          If I had aimed my reply to the author, I would not have replied to you. You are not that important. Ok?

          I totally disagree with your "assessment" of Apple. It's bullshit. Do you seriously think that ANY company would ignore a market segment where they could make money? Seriously?

          It was this that I was pissed off with.

          "Your existing is wrong!"

          A self-piteous play on the "you're holding it wrong" thing that has been done to death for donkey's years.

          But you knew that didn't you. You just wanted to play the "better than you" virtue signalling thing didn't you?

          1. find users who cut cat tail

            > Do you seriously think that ANY company would ignore a market segment where they could make money?

            Could they? Companies ignore market segments all the time – precisely because [they think] it is not worth it. Sometimes they are right, sometimes not.

            Accessibility is hard and tends to force you to change the design, i.e. get in the way. And Apple are not in the ‘everyone uses this’ business. They are in the ‘overpriced tat to show your superiority’ business. If everyone used Apple, Apple users would no longer be special. So to keep the air of exclusivity they actually need the unwashed masses to use something else.

            > Your existing is wrong!

            Did anyone say that? Please calm down.

          2. Phones Sheridan Silver badge

            "You are not that important. Ok?" And yet you've replied to me... twice! You appear to be one of those people who, for example, create an account on a forum just to post a reply along the lines of "Does anyone care!". Everyone else is left wondering why you took the time to post a non-post about something you don't care about.

            "Do you seriously think that ANY company would.... yada yada yada". In case you're not aware, Apple's UI designer wrote a book, "Interface Design: The Art of Developing Easy-to-Use Software" by Peter Bickford. In it he explains his 80/20 rule, which I can summarise with the words "don't cater to minorities", and it was this opinion that got his book slated on release by disabled programmers and users, arguably the very people who most need the "Easy-to-Use Software" he discusses. So do I seriously think? Absolutely yes!

  4. Anonymous Coward
    Anonymous Coward

    Android is not much better I'm afraid

    I'm dictating this on my Android phone right now. It's not so bad when I'm in quiet surroundings and speak carefully. (But) (i)n general I do have to make more corrections than I('d) like. I'm adding the corrections in parentheses so you can see what I mean, although(,) as usual with demonstrations(,) it's not going as wrong as it usually does when other people are watching and I'm trying to demonstrate it going wrong :)

  5. chuckufarley Silver badge

    The root cause...

    ...isn't much of a mystery to me. Apple wants to sell to the richest 25% of the market. People with severe disabilities are not normally in that market segment. In fact they are regularly found in the bottom 25% of the market. Therefore implementing features to make iPhones and Macs easier for them to use isn't going to lead to any more digits on the bottom line. In fact, it might even cost Apple more to write and test than they would ever make from the sales it generated.

  6. smot

    Eat Up Martha

    For Simpsons fans.

  7. Oh Matron!


    Apple, in my eyes, has always been great with accessibility features.

    However, I also use Siri to dictate on my iPhone, as I'd rather keep my phone in my pocket than advertise to the scroats of London that I have an iPhone.

    However, iOS 16 has been absolute garbage compared to iOS of old... It *may* be that I use my AirPods Pro rather than my older Powerbeats Pro to dictate, not sure...

    Even having Siri read stuff back is hideously broken: It can't differentiate between

    "I live here" and "Now, live from Norwich..."

    This is a choice by Apple to be this bad. If their goggles want to be successful, eye and hand gestures can't be relied upon solely: Voice has to be right up there too.

    1. anthonyhegedus Silver badge

      Re: Shame.....

      I just tested Siri reading stuff back with "I live here. Now, live from Norwich, a man who..." and it reads it correctly. This is in the iOS 17 beta. This gives us hope. I'm going to try a few more when I can think of them...

    2. David 132 Silver badge
      Thumb Up

      Re: Shame.....

      I have noticed that in the more recent versions of iOS, Siri handles dictated words it can't recognize by simply ignoring them, whereas before it would at least have a stab at transcribing them. Is it better, or worse, to send texts that say (for example):

      "I will be there, fuel" (current behaviour, with multiple words omitted), or

      "I will be there short Lee half two git fuel" (previously)

      Personally I preferred the earlier behaviour; while frustrating and comical in equal measure, it was usually at least possible to figure out what the sender was trying to say.

      As others have pointed out, Siri is also completely context-unaware; if I am passing the town of "Jonesberg", for example (name tweaked slightly for purposes of example), and have typed that name in multiple texts over many, many months, why does Siri insist on transcribing it as "Jonesburgh"? Which isn't even a similar pronunciation?
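      The context-awareness being asked for here need not be exotic: a recogniser could score candidate spellings against a frequency table built from the user's own sent messages, so a name typed many times wins over an unseen variant. A hedged sketch of that tie-breaking idea (the names and the two-stage scoring are invented for illustration, not how Siri actually works):

      ```python
      from collections import Counter

      # Hypothetical personal lexicon: how often each spelling appears in
      # the user's own sent messages. A real recogniser would combine this
      # prior with its acoustic score; here we just break ties with it.

      def build_lexicon(sent_messages):
          """Count every word the user has actually typed."""
          words = Counter()
          for msg in sent_messages:
              words.update(msg.split())
          return words

      def rank_spellings(candidates, lexicon):
          """Prefer the spelling the user has typed most often (0 if never)."""
          return max(candidates, key=lambda w: lexicon[w])

      history = ["Passing Jonesberg now", "Stuck outside Jonesberg again"]
      lexicon = build_lexicon(history)
      print(rank_spellings(["Jonesburgh", "Jonesberg"], lexicon))  # -> Jonesberg
      ```

      With this kind of prior, "Jonesberg" (typed twice before) beats "Jonesburgh" (never typed), which is exactly the behaviour users keep asking for.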


  8. DJWalker

    Barking up the wrong tree with accessibility

    Apple has had quite good accessibility features since before OS X. I don't think this was altruistic per se; as a fringe OS, I think they needed these features as a way of differentiating from Windows. Voice recognition on macOS/iOS is not bad because of a lack of interest in accessibility, it is just bad. It is worse than what you see from Google or even Microsoft; I have never seen a system with such poor overall voice recognition response.

    So why is it bad? I think part of the issue may be that Apple has taken such a hard line on not collecting personal information and uses training models based on random crowd response data (mostly American). Now I recognize there have been some articles (even in the Register) about Apple having surreptitiously collected data on user vocal responses. However, I don't think it's pervasive. I think they have been trying to engineer voice recognition technology without using personal data (or as little as possible).

    I've based this view on my experience over the years. There have been a number of instances (as an example) where I've dictated an email to Siri with a recipient name like "Jo-anne" (or another non-common spelling) and it ends up interpreted as "Joanne". The more common interpretation is chosen, even though the actual recipient name, with the less conventional spelling, is stored in the recipient line and in the contact database on my system. In fact, I used to use Dictation all the time and it's getting worse and worse. One wonders whether Siri is using contextual information like the contacts I've stored, or is just really bad. This happens on numerous occasions with all kinds of contextual data. It seems more likely that the AI is simply trying to interpret what you are saying based exclusively on vocal input, in the moment and in real time.

    I don't think that Siri is really training itself on what you're saying (even though the EULA says it might), but on whatever (on average) it gets from what Apple feeds it. The AI recognizes input based on whatever training data Apple is using. That training data seems to be sh*t (or, I suspect, it takes most of that training data from average Americans). I have seen it getting worse over time, not better, because on average Americans are exceptionally dumb. That is not a subjective assessment; numerous surveys on basic knowledge show America behind the curve. Despite my attempts to use dictation functions as a Canadian, it is getting worse. When I use the word "axe" in a sentence (just refer to the Monty Python folks about Canadians, lumberjacks and our love of chopping hardware), it gets converted into "ask". Apparently we are all African Americans. Any polysyllabic word gets converted into nonsense, and if you don't excessively drool or spit while talking, then good luck using Siri. The problem with Apple and accessibility with dictation is that all Americans are sufficiently disabled that Siri is too. If there is any 'reform' it needs to address the training data more than the algorithms.

  9. Paul Hovnanian Silver badge

    Victor Borge

    ... was on the right track.

  10. Joe Gurman

    Glad to hear that Dragon Speech, at least has improved with time

    I recall using it a couple of decades ago on the Mac, and it ranged from almost passable to horrendous. Far too much work to correct all its errors, even after repeated trainings.

    My principal complaints with dictation in macOS now are its steadfast refusal to learn (e.g. that I use certain non-English words regularly — what, the voice recognition AI hasn't been trained for Yiddish yet?), and scientific usage... as in, the Sun is always capitalized in astrophysics, solar physics, and space weather journals. As a 'Murrican, the British periodical of the same name and capitalization has little to no resonance for me.

  11. big_D Silver badge

    Not only voice...

    I feel for the author and the people he interviewed. I have the problem doubled, in that I dictate or speak to Siri in both English and German and it often confuses the system.

    The same is true when typing. On the iPhone, it turns verbs, especially German verbs, into nouns (capitalises them), and it often goes back and changes words earlier in a sentence, words you have already checked, to random other words, either changing the meaning of the sentence or turning it into complete gibberish. Because you have already checked those earlier words, you often don't notice they have changed, but I will often be looking at the text whilst typing and notice random words in other parts of the text changing.

    Android does this as well, to a lesser extent.

    macOS does some autocompletion, and if you are typing a word it doesn't know, it will always replace it with a known word! I was typing a reply yesterday and it replaced a company name it didn't know with a word from its dictionary every time I spelt it out in full and pressed space. Being a touch typist and looking at some source material whilst typing, I failed to notice this for a while and had to go back and replace all instances with the company's name. I have since turned off the autocorrect feature in macOS.
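    The failure described here, a fully spelt-out name silently replaced, comes from autocorrect treating every out-of-dictionary word as a typo. A less aggressive policy would only substitute when the word is very close to exactly one known word, and would remember words the user has confirmed. A rough sketch of that idea (the tiny dictionary, the threshold and the behaviour are all illustrative assumptions, not how macOS autocorrect actually works):

    ```python
    import difflib

    DICTIONARY = {"company", "reply", "yesterday"}
    USER_WORDS = set()  # words the user has typed out in full before

    def autocorrect(word):
        """Replace a word only if it's unknown, not user-confirmed, and very
        close to exactly one dictionary word; otherwise leave it alone."""
        if word in DICTIONARY or word in USER_WORDS:
            return word
        close = difflib.get_close_matches(word, DICTIONARY, n=2, cutoff=0.8)
        if len(close) == 1:
            return close[0]
        USER_WORDS.add(word)  # remember it instead of mangling it
        return word

    print(autocorrect("compnay"))   # -> company (a genuine typo)
    print(autocorrect("Frobnitz"))  # -> Frobnitz (unknown name, kept)
    ```

    The design choice is the asymmetry: a near-miss of one dictionary word is probably a typo, but a word with no close match is probably deliberate and should be learned, not replaced.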

    I hope Apple really do pull their finger out, or realise that it is harder than they thought and go back to the dictation software makers and work with them again.

  12. ericsmith504

    Apple has had a long history of being a market leader in accessibility features. 100% agree that there's a real need for the company to spend some of its billions in the bank to again be that market leader. We all benefit from improvements in these features, and I for one have sometimes wanted to launch my iPhone into the sun as punishment for its crimes against grammar and general speech recognition, and would love to see improvement.

  13. PRR Bronze badge

    > As a user with unique needs,

    Colin, your needs are not "unique". Life is full of people with MD-like conditions affecting their hands and even their voice (breathing). Just that most folks manage to ignore them.

    You posted 2,500 words, 15,000 characters. I am in awe of that extended effort. Why would you do that? Because it is IMPORTANT, to you and to others you know.

    If it was just Apple flip-flopping between two dictation apps, that would be bad enough. The market is not there (yet) (*) to develop and support two dictation apps, and Apple is usually good at picking ONE killer app. But to also harass Dragon into quitting the market--- this sounds like anti-crip policy, discrimination against differently-abled CUSTOMERS.

    (*) Actually, 20 years ago my friend wanted to "just talk to my computer". She typed fine, but talks better. None of the "talk to" apps since really fit her vision. Would she pay more for a fully speech-driven computer? If a billion people wanted it, would any developer work at it?

    I feel your pain even though my abilities are different. I can type and mouse OK (on good days) but my hearing is shot. So what? Well, everybody thinks video is the way to pass knowledge: YouTube, TikTok, Vimeo. Many can't manage a microphone. And the idiot video codecs degrade the audio rather than let the video break up. Phase-shift like a sewer. Yes, YouTube will on-the-fly transcribe subtitles, but with errors reMARKably like you describe in speech-to-text. "Lineman" as "alignment", "crews" as "cruise". I know speech understanding is tough!! (My partial deafness adds insight?) But Dragon had it 90% solved twenty years ago. (I will note that The Register posted a 4-way video chat on a platform which does not seem to do any speech transcription.)

    In the meantime we have developed money without a hint of precious metals, wanton copying untroubled by copyrights, global conspiracies without facts, viruses that demand ransom to not-delete data, or better: to send a video of me watching an XXX site to all my contacts.

    If computers are supposed to "be good at the things people are not good at" (else why buy the silly things?), then since 'universal problems' like math(s) and communication and some spell-checking are semi-solved, this should now include less-common less-good abilities like manual (with hands) user interfaces and sound-based output. Ah, but how to monetize it?? (And Dragon did OK financially until whatever you say Apple did to drive them away.)

  14. -tim

    Apple won't follow their own standards

    When Apple introduced macOS 13 Ventura, they added a feature to right-click and cut an object out of a picture, effectively removing the background of an image. The problem is this takes a while, and then the right-click menu gets another item added at the bottom. If the screen is set up for people with poor eyesight, the menu will jump just as a menu item is selected. That menu is often used for "Open Image in New Window" followed by zooming to be able to see the image properly.

    That new "Copy Subject" feature needs a way to disable it. It wastes power, as it runs the GPUs at full speed, and some of us never want to use it. Adding the extra menu option after a few seconds goes against Apple's own design guidelines; the option should be greyed out until it decides whether it will work or not. Apple's own page on the feature says "it might take a few seconds for Copy Subject to appear." Meanwhile it is burning through battery power for a feature that is mostly used for creating copyright-violating memes.

  15. Stuart Castle Silver badge

    Of course, the sad thing is that there are some very good solutions out there for making computers (and consoles) accessible to disabled users, if those users can afford them. Sadly, in a lot of cases, those users can't and while they may qualify for help from a local authority, or charity, the budgets for those organisations are limited, with the limits effectively being reduced year on year.

  16. Anonymous Coward
    Anonymous Coward

    Sorry guys...large incremental dev cost, no extra revenue.

    First up, I qualify in a bunch of categories under the ADA in the US. But I've never gone around whining for special treatment etc. That's just the way it is and you work around it.

    I have also led multiple dev teams over the years for mass-market consumer software products. We shipped millions of SKUs. Given all the usual problems of actually getting a product out the door (the dev project failure rate has stayed at about 80%+ for decades), "accessibility" is at the very bottom of the list of things we worry about. Depending on the application and the platform it's a non-trivial amount of dev time, and the potential return, if you actually ship the product, is a tiny fraction of 1% of revenue. That's the brutal truth.

    Since the mid 1980s there have been specialized products to cater to this very small market. They cost a lot of money because the market of those willing to pay for the large amount of engineering work that is needed to ship these products is small. That's just the way it is. You want custom "accessible" products, which is what they are? Then show us the money.

    As for speech recognition, especially for uncommon accents or for speech impediments: that's a hard problem to solve. Good enough for Siri queries? Doable. Good enough for accurate dictation transcription? That's a difficult problem, which means it takes a lot of money to solve. And the solution for the edge cases will never be perfect. Untrained speech recognition has improved enormously since I did my first technology / product review of one of the engines 25 years ago, and as others have mentioned, the engine that Dragon uses is about as good as it gets. Or is likely to get. If Dragon does not work for a person then I'm afraid they are out of luck.

    As for Apple and speech recognition: the fact that Dragon doesn't ship a macOS product tells you all you need to know about how small the potential Mac user base is. So Apple themselves are not going to spend much money on it. And for those who want to use a legal stick to force software developers to support very expensive "accessibility": I suggest you read up on how the FASTER Act worked out when the Feds made onerous requirements on food manufacturers for allergic reaction risk affecting a tiny fraction of a percent of the population. It was horrifically expensive to prove there were no allergens anywhere in the manufacturing process (in this case sesame), so the food manufacturers just added trace allergens to products which previously did not have them and then stuck on a warning label.

    That's how the real world works.

    So beware of unintended consequences. Because they are absolutely guaranteed. Apply the ADA to software and just watch all the small developers and low-volume / low-margin applications disappear. For everyone.

  17. cactustweeter

    Apple voice control my perspective

    Thank you Colin Hughes for your article and perspective. I, like you, have a severe physical disability. I do not have muscular dystrophy; I am a high-level quadriplegic. I do not require a ventilator. I can shrug my shoulders but I cannot move anything below my shoulders. I operate my computer completely by voice.

    I agree, Colin, that Apple needs to keep improving Voice Control. I disagree that Apple has not done anything to improve Voice Control since its release with Catalina. I started using Voice Control with Big Sur. Today I'm using Voice Control on macOS Ventura. From Big Sur to Ventura, more than 100 commands have been added. I believe Monterey added the ability to import/export vocabulary/commands. In Ventura dictation accuracy greatly improved. Ventura also brought us spelling mode functionality. If iOS/iPadOS 17 are any indicator of things to come with macOS Sonoma, I'm very excited to upgrade.

    I know many were fond of Dragon for Mac. I used Dragon for Mac from the first version until its last. Dragon for Mac had great dictation accuracy, but its command-and-control was pretty lousy. Voice Control's command-and-control has been pretty incredible from the start and it continues to get better.

    How can the disabled community get Apple to improve Voice Control? Writing articles about its shortcomings is one way. Another way is to provide feedback. This can be done in multiple ways: one is through the feedback webpage, another is to participate in beta testing. During beta testing you get to report bugs, but you can also provide guidance on how the product can be improved. Probably the least effective is bitching about Apple in general.

    1. Colin Hughes

      Re: Apple voice control my perspective


      I appreciate your engagement and your insights into Voice Control. I agree with several of the points you make.

      Unfortunately, I spent three long summers buried in the betas feeding back, and offering up suggestions through the feedback page, and through Apple’s accessibility department, all to no avail.

      Far from bitching or venting frustrations at Apple, my intention with the piece is to shed light on the limitations and gather opinions from the community, and I’m not surprised to see that many share similar concerns. I have always found Apple very responsive to publicly expressed advocacy. It’s one of the company’s strengths in my opinion.

      Unfortunately, Spell Mode never made it to the UK where I am located. Strangely, it was launched US only. I don’t think it has arrived here in Sonoma either. Apple hasn't said.

      Spell Mode is a blunt instrument, as it doesn't learn from your corrections, so the same recognition errors keep happening over and over. If Voice Control doesn't recognise how you pronounce the name of your dog Rover, it is never going to learn, and there is nothing a user can do.

      It’s a pity Apple doesn’t do more to publicise new commands. You will be hard pressed to find them mentioned in release notes.

      Bulk import of vocabulary isn’t very helpful when a lot of the vocabulary I want to import are proper nouns and the capitalisation is always ignored by the app.

      Commands are very helpful for navigation, and for navigation specifically, I acknowledge Voice Control is good. However, echoing your own personal tip, I have had to report several bugs with commands in Sonoma recently. How long they'll take to get fixed is anyone's guess.

      The focus of my article was dictation with Voice Control, and I never saw an improvement in Ventura. However, I am seeing a slight improvement in Sonoma, but accuracy will always be compromised until Apple fixes the proper noun issues that have persisted since Voice Control launched four years ago. It's not only about "Sun" and "Will"; there are many others it fails on, too many to mention in the article. I reported the issue through the feedback app and the accessibility department on numerous occasions over the past four years, all to no avail. It was this lack of action on feedback that prompted me to write the opinion piece.

      I respect your positive experience with Voice Control for certain tasks. However, I believe that it could and should evolve into a more robust dictation application for long-form content.

      As a voice command/navigation application Voice Control is powerful. Commands are short phrases by nature, and Voice Control copes with short dictated phrases like “Happy Birthday” or “I will be home in twenty minutes” with ease, but for anything longer the app’s dictation capabilities and productivity quickly fall apart. I have Dragon installed on my Mac with Parallels, and the difference between the apps is night and day; I readily accept it is worth several hundred pounds/dollars.

      As one of the commenters here rightly pointed out, it wouldn’t take much for Apple to address many of the issues and enhance Voice Control’s productivity, benefiting users of all abilities. I hope that this article resonates with someone within the company who can make a difference. The time for Voice Control to mature as a reliable dictation tool is long overdue.

      Here’s hoping a piece like this at least makes a contribution to improvements next year. At the moment that feels like a long wait.

      Thank you for your thoughtful engagement, and I appreciate the opportunity to continue this conversation.

      Best regards

