" It's also of great help for those with sight problems. If you have hearing problems then, you're not going to buy an audiobook in the first place"
Assuming that everyone has either perfect hearing or perfect vision, or that everyone with hearing/sight loss has total loss in that department is rather dubious. Disabilities are very rarely binary.
I can read books, but it gets very tiring due to my visual disability. Similarly, I can listen to audiobooks (assuming they're streamable directly into my hearing aids), but that gets tiring too.
Combine the two and the additional processing my brain needs to do is vastly reduced. The result: more reading (and therefore more sales of books).
Note - I'm just one person, with one peculiar set of disabilities. As has been pointed out elsewhere, this would also be really good for learning another language - and that's a whole lot more people.
Given that the text is already available, the 'on the fly speech -> text' part seems pointless. It's not as if we don't have a perfectly good subtitle file format already (sub or ass, I don't mind). Feed the ML system the book and the audio and it can probably autogenerate a very good subtitle file, which could easily be distributed with the audiobook.
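To show how little is needed on the output side, here's a minimal Python sketch. It assumes the hard part (aligning the book text to the audio) has already been done and has produced a list of (start seconds, end seconds, text) cues. The function names and the cue layout are made up for illustration, and it writes SRT rather than sub or ass simply because SRT is the easiest format to show in a few lines.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) cues.

    The cues are assumed to come from some earlier text/audio
    alignment step, which is not shown here.
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical cues, purely for illustration.
    example = [
        (0.0, 2.5, "First sentence of the chapter."),
        (2.5, 6.0, "Second sentence of the chapter."),
    ]
    print(to_srt(example))
```

The resulting file is a few kilobytes of plain text per chapter, so shipping it alongside the audio costs next to nothing.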
Now if they are trying to patent the idea of subtitles then that's another thing entirely.