* Posts by Jurgen

2 publicly visible posts • joined 15 May 2007

How Google translates without understanding

Jurgen

Roundtrip Fun

Another interesting feature of the Google 'translator' is that if you take the title of this article, translate it from English to another language and then back, you'll end up in a completely different place in English. Try it with slightly longer text and it just gets worse - this would be the first thing I'd fix.

English:

How Google translates without understanding

French:

Comment Google traduit sans arrangement

Back to English:

How Google translated without arrangement

Oops
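If you want to repeat the round trip on more than one sentence, it is easy to script. Here is a rough sketch, assuming a hypothetical `translate(text, src, dst)` function that you would wire up to whichever engine you are testing. No real translation API is called here; the stub below only replays the example from this post, and the word-level similarity score is just one crude way to measure drift.

```python
from difflib import SequenceMatcher


def round_trip(text, translate, src="en", pivot="fr"):
    """Translate src -> pivot -> src and report how far the text drifted."""
    forward = translate(text, src, pivot)
    back = translate(forward, pivot, src)
    # Word-level similarity in [0, 1]; 1.0 means the round trip changed nothing.
    similarity = SequenceMatcher(
        None, text.lower().split(), back.lower().split()
    ).ratio()
    return forward, back, similarity


# Hypothetical stand-in for a real engine: it only knows the example above.
CANNED = {
    ("How Google translates without understanding", "en", "fr"):
        "Comment Google traduit sans arrangement",
    ("Comment Google traduit sans arrangement", "fr", "en"):
        "How Google translated without arrangement",
}


def stub_translate(text, src, dst):
    return CANNED[(text, src, dst)]


forward, back, similarity = round_trip(
    "How Google translates without understanding", stub_translate
)
print(forward)
print(back)
print(f"similarity: {similarity:.2f}")
```

Swap `stub_translate` for a real engine and loop over a list of sentences to see which ones survive the trip.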

Jurgen

Paralink - 8 Google - 0

I recently evaluated several free translation engines available on the market on a simple paragraph of text, using the same Spanish text as the source and English as the target language (you can always do that even if you speak only one language fluently, as most people in English-speaking countries do).

Google made 8 style errors and 2 grammar errors on the paragraph, while another engine (Paralink) made only 1 style error and 0 grammar errors, a massive difference. It shows that competitions in which computer programs score other computer programs (with a metric like BLEU) are pretty much meaningless.
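For anyone unfamiliar with it, BLEU-style scoring is built on n-gram overlap between the machine output and a human reference translation. Below is a minimal sketch of the clipped n-gram precision at its core. It is not full BLEU (there is no brevity penalty and no averaging across n-gram sizes), and the sentences are made up for illustration. It shows what is being measured: surface overlap with a reference, not grammar or style as a human reader would judge them.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


reference = "the cat sat on the mat"   # made-up reference translation
scrambled = "mat the on sat cat the"   # same words, nonsense order

p1 = modified_precision(scrambled, reference, 1)
p2 = modified_precision(scrambled, reference, 2)
print(p1, p2)
```

The scrambled sentence matches every single word of the reference but none of its word pairs, so how well such a score tracks a human count of style and grammar errors depends heavily on the reference text and the n-gram sizes used.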

The problem, as you rightly point out, is obviously the purely statistical approach, which should be combined with case-based (exception-based) reasoning. Also, the Google team would greatly benefit from reading Pinker's book 'Words and Rules', which takes you through the history of different methodologies in linguistics, from Chomsky to (statistical) neural networks, and shows how the two can be combined.

Disclaimer: I do not work for Paralink, nor do I own their stock; I just wanted to set the record straight that Google is not the only game in town, nor is it even the best or the market leader. Now the problem is that it is hard to find Paralink using Google search, but that's a whole different story :). Thankfully we still have other search engines...