Everyone does it
The Irish tell Kerryman jokes, the Americans laugh at Polacks, the French at Belgians, and the rest of Germany ridicules Bavarians (Bismarck: "A strange fellow, your Bavarian; halfway between an Austrian and a human being.")
There's been quite the little chortle in this part of Central Europe this week regarding the actions of a tourist board in Moravia. To set the scene, Moravians are thought of as the slightly slower country brothers of the Bohemians, (or “true Czechs” ... one local bar has “We have Moravian and Czech wine” in the window) in …
The Irish tell jokes about people from Kerry. People in Kerry tell jokes about one particular village. The people of that village tell jokes about one family. The people in that family tell jokes about their dad. Who actually does all these things. He gets home from work and says 'I saw a guy on the bus today. He had custard in one ear and jelly in the other. I said "Are you a trifle deaf?" and he said, "No, I'm a mental patient."' And the family stand around saying 'I don't get it.'
-Paul Merton
Unlike the US, where racial theory and its subsequent eugenics "science" was actually conceived and heavily promoted, even up through the '60s, amongst scientists and government, and still holds firm ground in quite a few of the more... conservative... areas (be it physical or political)?
The only difference between Germany and the rest of the "Civilised World", is that they incorporated and formalised the concept with the usual Gründlichkeit into Policy. In the US ( and the British Commonwealth/Colonies) it was already so ingrained into culture it didn't need formalising. It was already in effect in full force, and it got worse in the US after WW II.
Oh, and the basic concept of the concentration camps wasn't new either. The Nazis were quite taken with the efficiency of Dealing With Things in the US, and simply copied the concept of Native American "reservations", and put it through the Gründlichkeit mill.
While we're tangentially talking about the Irish and mistranslations, let's not forget the Gardaí scooping the Ig Nobel prize in Literature back in 2009 for something similar.
The economics of why they couldn't hire a professional translator despite having a huge budget. Or cultural working practices which allow bodge jobs like this.
Not AI. Who said AI has to understand the world using human language? The original Terminator used 6502 machine code and still managed to wipe out a load of humans.
The usual 'fiddle' for euro-dosh; in fact it's hard to really call it a fiddle. The EU offers money for regional development, tourism etc. An enterprising mayor or other local official plays the game for all it's worth. The local economy benefits. Keeping the benefits local means employing whoever is available locally rather than bringing in qualified outsiders.
I know a village in Italy which has an extraordinary number of brown tourist signs and maps of (imagined) footpaths and cycle ways. There are few tourists but it keeps the local signmaker in business.
ref. economics of big budget v. AI translation? Simples. They apply for an EU grant because they're starved of local (national) funding. Or they believe they're starved. Or believe they should get more funding. Or somebody showed them how easily it can be got by applying for a grant. So they go to a grant application support service (you can't apply for grants directly, and where you can, the rules are fiendishly complicated), which takes a fee. Plus a cut of the grant once received. Actually, 25K is peanuts; I once saw a grant worth a couple of million EUR to restore and scan several dozen old maps at some Lithuanian library. (That was a good few years back, though the scans haven't surfaced yet; one must be patient with such maps.) So, they get a grant, buy a couple of pretty toys (at hugely inflated prices), also find a couple of "roles" for their existing staff to maintain the project, then obviously along comes a web designer biz (completely unrelated to the grant applicant and middle man) with a turnkey solution, including the abovementioned cost-optimised translation. End result - everybody happy.
Why, are you not, the dear end user and potential income source? Well, fuck you, we did our best to make you come!
On a relevant and related note; IIRC Arnie is always dubbed into German by someone else, despite speaking the language natively. Apparently he hails from a rather bucolic corner of Austria and nobody is scared of a killing machine that says:
<Wurzel>
Oi'll be baaaach!
</Wurzel>
Who said AI has to understand the world using human language?
The people who want to use it for Natural Language Processing.
I suspect someone will say Turing did, thereby demonstrating a complete failure to understand Turing's point - which was essentially a rejection of metaphysics in favour of pragmatism in developing an epistemology of mind.
Saved my lard, in French speaking bits of Africa. Anyway, hints and tips:
- Replace proper nouns with something known, before you translate (and put them back in, after translating).
- Same with any industry specific technical terms, as they can cause sentence structure to break down badly.
And remember, it's a courtesy machine translation, not an attempt to trick people into thinking you know the language (Google Translate will gladly prove otherwise). For web sites, brochures and so on, consider employing an actual human translator. They'll appreciate the business.
Now, did you hear the one about the Irish.....
No its utter crap. I regularly use it to 'translate' German and Italian. It puts words in the wrong order, often says the opposite of what the original language says and for some stupid reason translates people's names. Sometimes you need a degree in translating the output of Google Translate to make any sense of it.
@Irongut agreed.
I had to translate a user manual into German from English. I thought I could save a bit of time by putting it through Google Translate and just tidying up. After my sides stopped hurting and I could climb back on my stool again from laughing so hard, I started from scratch.
Translate seems to have real problems with formal English as well. If you use "do not" it translates it as "do", if you use "don't" it translates it as do not... Not very helpful if you are writing safety instructions. It translated "do not open the case, high voltage electricity inside" into the German equivalent of "open the case, high voltage electricity inside." Google had obviously noticed that I had swapped my Android phone out for a Lumia.
Or, how about, "do not open the case, no user serviceable parts inside"? That got translated into "do open the case, no parts inside." I was robbed during translation!
I wouldn't like to use it to translate into a language I didn't know - that's what professional translators are for - but for translating some unknown foreign text into 'usable' English it is pretty good. And free.
"I wouldn't like to use it to translate into a language I didn't know - that's what professional translators are for - but for translating some unknown foreign text into 'usable' English it is pretty good. And free."
I found, for my industry (software for the food industry, especially meat processing), Google is fairly useless at going back and forth between German and English.
Translating documentation and press releases that my company produces (I am theoretically a project manager, but have to rush out translations all the time), I find that Google is good for about 20% of the translation; I tend to use Leo and Linguee much more often. I have noticed that Bing Translate has improved recently. I did a head to head between Google and Bing on a press release: around 20% in Google was usable, in Bing probably closer to 40%.
Using your noggin and Leo / Linguee is still a much faster, safer and more accurate translation, if you understand both languages.
No its utter crap.
At least it knows the difference between "its" and "it's".
It's very successful at doing what it was designed to do. You're asking it to do what it's advertised as doing, which is quite a different thing.
I find Google Translate passable if a) you're translating into your own language (i.e. for informational purposes) and b) you have at least some knowledge of the source language to spot the obvious errors. Even then, I've had it completely reverse the meaning of a sentence - I've no idea how.
My favourite Google Translate error, though, was merely a matter of it trying to be too clever. We do various Oracle DB software, and one of our Swiss customers raised an issue talking of "der DB-Fehler", which Google Uebersetzer decided to render as "the German Railways error".
"My favourite Google Translate error, though, was merely a matter of it trying to be too clever. We do various Oracle DB software, and one of our Swiss customers raised an issue talking of "der DB-Fehler", which Google Uebersetzer decided to render as "the German Railways error"."
Dealing with that is easy. Replace DB with Pig, before translating from English to German. Then replace Schwein with DB, in the German translation. With this approach, you stand a chance of getting grammar that is butchered but still good enough for a native German speaker to get it.
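For anyone who wants to automate the swap-and-restore trick, here is a minimal sketch. Everything in it is an assumption for illustration: `mask_terms`, `unmask_terms` and the `XTERM0X`-style tokens are made-up names, and `fake_translate` is a toy stand-in, not a real translation API. Note that it uses opaque tokens rather than a real word like "Pig", so a real translator may treat them less gracefully than an ordinary noun.

```python
import re

def mask_terms(text, terms):
    """Swap each protected term for a numbered placeholder token (hypothetical scheme)."""
    mapping = {}
    for i, term in enumerate(terms):
        token = f"XTERM{i}X"
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text):
            text = re.sub(pattern, token, text)
            mapping[token] = term
    return text, mapping

def unmask_terms(text, mapping):
    """Put the original terms back after translation."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text

def fake_translate(text):
    """Toy stand-in for a real translator; only knows two words."""
    return text.replace("the", "der").replace("error", "Fehler")

masked, mapping = mask_terms("the DB error in Oracle", ["DB", "Oracle"])
result = unmask_terms(fake_translate(masked), mapping)
print(result)
```

The point of masking before translating is simply that the protected term never reaches the translator, so it cannot be "helpfully" expanded into German Railways.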
I've had it completely reverse the meaning of a sentence - I've no idea how.
Not at all an unexpected result if you do a bit of research into linguistics and machine translation, particularly the huge-corpus approach used by Google.
You take some corpora in various languages that have been annotated by human judges - parsed, basically, into part-of-speech and phrasal grammatical structure. You train some ML model (Google likely use NNs, because they have an unnatural affection for the things, but you could use HMMs or MEMMs or various others) based on those inputs. Presto, you have a probabilistic model for parsing the language.
Then you take your huge corpora of texts that you have versions of in multiple languages. Remember when Google indexed the web and scanned all those books? Yeah, that. You use that to build probabilistic maps from language A to language B, for various pairs {A,B}. You can use longer chains (A->B->C) to add more information to the model, but you have to weight it lighter because successive translation introduces more noise, so there's a point of diminishing returns.
When you get fresh input to translate from A to B, you first check to see if you already have a translation of the whole thing, or of big chunks (sentences, say). If not, you parse the input into phrase-level chunks, using a model from the first set, and then translate those chunks, using a model from the second.
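A toy sketch of that lookup order, to make it concrete. This is not Google's code: the tables, the probabilities and the whitespace "chunker" are all invented for illustration, and a real system would use a trained parser and vastly larger models.

```python
# Hypothetical data: a memory of whole-sentence translations and a
# phrase table mapping each chunk to candidate translations with
# made-up probabilities.
SENTENCE_MEMORY = {"guten morgen": "good morning"}
PHRASE_TABLE = {
    "der": {"the": 0.9, "that": 0.1},
    "db-fehler": {"database error": 0.4, "German Railways error": 0.6},
}

def translate(sentence):
    key = sentence.lower().strip()
    # Step 1: reuse an existing translation of the whole input if there is one.
    if key in SENTENCE_MEMORY:
        return SENTENCE_MEMORY[key]
    # Step 2: otherwise split into chunks (crude whitespace split standing in
    # for a parser) and pick the most probable candidate for each chunk.
    out = []
    for chunk in key.split():
        candidates = PHRASE_TABLE.get(chunk)
        if candidates is None:
            out.append(chunk)  # unknown chunk: pass it through untouched
        else:
            out.append(max(candidates, key=candidates.get))
    return " ".join(out)

print(translate("Guten Morgen"))
print(translate("Der DB-Fehler"))
```

With these invented weights the second call yields "the German Railways error": the model has no idea what you meant, it just picks the likeliest candidate per chunk.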
This sort of approach, with a bit of tweaking and a really big set of corpora to train from (which is what Google have) has a pretty good success rate - somewhere in the 90%-95% range on typical inputs.
What about the sentence-meaning-inversion thing? Natural languages have many, many ways to invert meaning at the phrase and sentence level, and language users keep coming up with new ones. Often this can hinge on the presence or position of a single word, or punctuation, or context from other sentences in the text - consider sarcasm, for example.
And then there are sentences which are simply ambiguous in language A but can only be translated into one of a set of non-ambiguous sentences in language B, for example because of grammatical inflection in B. (All natural languages admit all sorts of ambiguity, of course, but they have different constraints on its particular forms.)
Meaning inversion is a very easy trap for blind-translation models like Google's to fall into, because a given phrase often has inverted-meaning local maxima in its probability distribution.
I've found out the same thing: it's mostly usable if... I do a lot of communicating with several Frenchmen on a hobby interest (18th Century Warships), so what we do is type emails in our native language and paste them into Google Translate. Given that many terms are archaic, we manage pretty well all things considered.
Google Translate often offers alternatives. Presumably all the "corrections" that users are invited to submit are given a weighting depending on the number of times they have been submitted. It is tiresome having to keep correcting its translation of the German "messe". It uses "fair" rather than "mass" - although possibly that is down to it not understanding the context.
I use it every week when trawling web pages in English, German, Austrian, French, Dutch, Italian, Spanish, Catalan, Norwegian, Danish, Swedish, Icelandic, Russian, Polish, Latvian, Estonian, Lithuanian, Bulgarian, Czech, Slovak, Chinese, Japanese - plus a few other less common ones.
A slight problem is that it often completely omits words in the translation - usually adjectives. Yet elsewhere in the same piece it shows it can translate that word.
Fair is the correct translation for Messe, though not fair as in fair play, but as in a fair being held in a fairground. The translation of "mass" would be "Masse", not "Messe".
As for the corrections: that option is gone from the mobile app and I don't get the impression that Google does anything with it. It still translates Dutch "zwaluw" to the French "avaler" instead of "hirondelle".
Why? Because a zwaluw is a swallow and the verb to swallow is avaler in French.
So another tip for using Google translate: translate from English, even if English is not your first language. (Although in this case you still don't get the bird, just the verb)
This is quite interesting. Out of curiosity, I just typed "Is that a swallow" (no question mark) into Google Translate and I got the expected "est-ce une hirondelle". The funny thing is, it changed into that at the last moment - while I was typing, one letter short it was still showing "est-ce une déglutition" which seems to be related to the _other_ noted meaning...
Maybe the addition of "a" or "een" signals that the verb should not be used.
Think of the input as an N-dimensional vector pointing into an N-space of possible translations. Each complete word adjusts that vector. Google shows you the data point in the translation space that's closest to the end of the vector.
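A deliberately tiny sketch of that picture, in two dimensions instead of N. The axes (roughly "noun-ness" and "verb-ness"), the word vectors and the candidate points are all numbers I made up; this only illustrates the nearest-point idea, not how Google actually represents anything.

```python
import math

# Invented 2-D vectors: (noun-ness, verb-ness). Each word nudges the input vector.
WORD_VECTORS = {
    "a": (1.0, 0.0),        # an article pushes towards a noun reading
    "to": (0.0, 1.0),       # "to" pushes towards a verb reading
    "swallow": (1.0, 1.2),  # ambiguous, leaning slightly towards the verb
}

# Invented points in "translation space".
CANDIDATES = {"hirondelle": (2.0, 1.0), "avaler": (1.0, 2.0)}

def closest_translation(words):
    # Sum the contribution of every complete word typed so far.
    x = y = 0.0
    for word in words:
        dx, dy = WORD_VECTORS.get(word, (0.0, 0.0))
        x += dx
        y += dy
    # Return the candidate nearest to the end of the accumulated vector.
    return min(CANDIDATES, key=lambda name: math.dist((x, y), CANDIDATES[name]))

print(closest_translation(["swallow"]))
print(closest_translation(["a", "swallow"]))
```

With these made-up numbers the bare word lands nearer the verb and adding "a" tips it over to the bird, which is the kind of last-moment flip described above.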
<<It [Google] still translates Dutch "zwaluw" to the French "avaler" instead of "hirondelle".
Why? Because a zwaluw is a swallow and the verb to swallow is avaler in French.>>
"One swallow does not make an orgy" - the late, great Willie Rushton on I'm Sorry I Haven't a Clue's complete the proverb round.
"The translation of "mass" would be "Masse", not "Messe"."
I have seen the German "Masse" translated as "mass" when talking about the mass property of an object.
German music sites use "Messe" for the religious service and the English translation they give is the expected "Mass". Google Translate usually gives "fair" or "trade fair".
Googled using the keywords : facebook knabenchor messe
Facebook postings for several German choirs use "Messe" in the context of a religious service.
Here is an interesting example where in one paragraph Google gets it right and wrong. Correlated with other examples - Google Translate appears to apply the correct context if it recognises the format of a church name in the same sentence. Bold indicates the translated words.
"Mit einem Foto von der Abendmesse in San Bonifacius, Verona, und einem anschließenden kleinen Konzert gemeinsam mit dem Chor Hartolan Viihdekuoro aus Finnland wünschen wir eine gute Nacht. Am Sonntag singen wir noch einmal in einer Messe in Verona"
"With a photo of the evening Mass in San Boniface , Verona , and a subsequent small concert together with the choir Hartolan Viihdekuoro from Finland , we wish you a good night . On Sunday we sing once more in a trade fair in Verona"
The robots, the algos, unrestrained aren't about to take all our jobs. Simply because they're not yet very good at doing things which we humans do without much effort, which is to distinguish between different potential meanings of words and put them into context on the fly.
This project strives to solve exactly this problem. Not yet there, but...
You have to be so careful with automatic translation... Machines have no idea about idioms and colloquialisms, so avoid them like the plague.
Sometimes people just don't realise that their local name for something isn't universal... This is especially true of food. I've seen English menus which are a literal translation of the local language, which are no more help to me than the original.... "Princess steak - Steak cooked in the traditional princess style" (I think it was princess)... Perfectly good English, just completely useless!
If you want to see truly bad machine translate, try bing. I've yet to use it to translate any of my foreign Farcebook friends into anything more than gibberish.
Ah yes, reminds me of my favourite restaurant in Brussels. Now long since ruined. Although, I've not been in a while, so I can hope for redemption.
They had menus in 6 languages. But the English translation of tete de veau was "tete of veal". Which is all very well if you remember that tete means head - but if you just think it's a French cut of meat, then you're in for a surprise when you open the stock pot and there's a baby cow's head staring up at you.
There weren't many screams while I was eating there, so I can assume that not too many people made that mistake.
Or perhaps the waiters were aware, and warned people. My brother bought some horse from the butcher's, when on holiday in Sardinia. It was deliberate. But the butcher, knowing that the English are traditionally squeamish about eating horses, tried to make sure, to his credit. He didn't speak English though, so this involved making ear signs with his fingers while blowing out his lips and making brrrr and whinnying noises.
In a traditional restaurant in Northern Italy I was given a white cube of something as a starter, with a square of pork fat carefully placed on top of it. Upon politely asking what it was, the waitress answered that it was 'mice'. After a bit of inspired Give Us A Clue, she kindly refined her answer to 'maize'.
In a greasy spoon in Hong Kong where only one person spoke more than a few words of English, I asked if they had an English menu. I was handed a piece of well-worn card upon which were the handwritten translations of the Chinese names for each dish. No.13 was 'Three Flower Delight', No.6 was 'Trophy of Kings', etc.
"In a greasy spoon in Hong Kong [...]"
The Daily Telegraph often has a feature of pictures of signs mistranslated into English.
Here is one specifically for Chinese signs.
http://www.telegraph.co.uk/travel/destinations/asia/china/11274952/Lost-in-translation-hilarious-mistranslated-Chinese-signs.html
When I was working just outside Brussels the hotel had an English version of the menu which had obviously been translated using a dictionary and no intelligence. 13 years on I still remember "Roast Bergylt with St Peter's Shells". It turned out to be baked haddock with clams. Their "wild rabbit terrine" was hare paté and so on. Happy days!
Nothing contentious in stating that:
a) Natural Language Processing is a hard problem and translating from one to another gives compound difficulty
b) Lazy design choices may lead to ridicule
Mind you, if the AIs do take over and they don't have good NLP then they'll make human grammar nazis look like liberals. Sarcasm will be the first casualty, followed by slang, metaphors and similes and we'll all start talking very slowly and very clearly, just in case the pervasive AI spy systems take something out of context and send a T1000 to purge the deviant meatbag who said it.
Surely sarcasm is the main defense we have against them.
Many moons ago I was part of a student exchange with a twin town in Germany. Obviously our German was pretty useless and their English was excellent. We did however make the mistake of teaching a few of the group sarcasm. From that point on their accent/delivery meant we couldn't tell if anything they said was serious or sarcastic.
Natural Language Processing is a hard problem and translating from one to another gives compound difficulty
I have read that the basis of Google Translate isn't the traditional Natural Language Processing algorithms that get confused by arrows and bananas. They realised that despite decades of work parsing algorithms that translate by extracting meaning are still inadequate. So it basically translates by searching for existing translations of words and phrases. I assume that there must be a lot of algorithmic processing to assign appropriate weight to selected translations, too.
The compound difficulty aspect is interesting. Because of the way it works, Google Translate can translate between language pairs where there is no history of human translation.
There's a non-technical article here.
Interesting article. I see that it claims that Google use English as an intermediate language in cases where no direct translations exist between source and target. It would be interesting to see how Google's approach compares to a traditional NLP parsed approach in terms of accuracy.
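If English really is used as the go-between, the Dutch "zwaluw" to French "avaler" slip mentioned earlier in the thread is easy to reproduce in miniature. The dictionaries below are hypothetical, and so is the assumption that the verb reading happens to be listed first; it is a sketch of the failure mode, not of Google's pipeline.

```python
# Hypothetical word lists. The pivot step throws away the fact that
# "zwaluw" can only be the bird.
NL_TO_EN = {"zwaluw": "swallow"}
EN_TO_FR = {"swallow": ["avaler", "hirondelle"]}  # assumed order: verb first

def pivot_translate(word):
    """Translate Dutch to French by going through English."""
    english = NL_TO_EN.get(word)
    if english is None:
        return None
    options = EN_TO_FR.get(english, [])
    # Taking the top-ranked option loses the bird entirely.
    return options[0] if options else None

print(pivot_translate("zwaluw"))
```

The Dutch word is unambiguous, but once it has been flattened to English "swallow" the second hop has no way of knowing that.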
"I always thought that was really Transylvania. Y'know, the people the Slovaks get to laugh at ..."
S'okay, we do get in turn to laugh at all the westerners who never fail to associate us with Dracula but have no idea what country Transylvania is actually in (no, not Mordor). Full circle then...?
Well, while Transylvania is in Romania, that guy Dracula (at least according to Bram Stoker's novel) was a member of the dominant group of people around there at the time who trampled on the Romanians (then called Wallachians). So it's only fair the Hungarians get the blame for the vampire shenanigans instead of the Romanians, even if it's a pity the Romanians don't get their share of the tourist business.
Websites can be edited for not much more than it would have cost to do the thing properly in the first place. Restaurants can be renamed, albeit it costs more because you have to replace the signage, menus, and any other customised decor. If you want something really expensive to fix, how about a TV programme?
A BBC series called Episodes had a gravestone which said in English that the departed would be dearly missed, and in Hebrew that he would be pickled at great expense.
I saw a news story from Wales, possibly in El Reg a few years back. It was painted on the road in Welsh. It should have said something like "Give Way to oncoming traffic". But what it actually said was "I am currently out of the office and will return tomorrow".
They'd sent the email off to the usual person who did their translation, and painted the response on the road. Oops.
Stribrnice is the name of a town, and also of a fish.
Google Translate's custom search pattern matching (known as 'poteto potato') decided it's worth a try.
Czech : Stříbrnice evropská
Latin : Argentina sphyraena
English : Argentine
https://en.wikipedia.org/wiki/Argentina_%28fish%29
So when someone tells you about their passion for the ' Greater Argentine', maybe it's good to clarify first.
For a recent visit to the Czech Republic for a few weeks (and having done so several times in the past) I decided to give learning Czech a crack. Calling it a complicated language is an understatement. The verb conjugation especially is rather complicated, with a lot more options/versions than plain English/Dutch/German. On top of that they even decline the nouns. Jedno pivo is one beer. Dvě piva is two beers. Pět piv is 5 beers... Saying "I'm going to the hangar" can be done in a single sentence composed of a single verb.
Getting your head around it is tricky. I imagine translating it correctly is even more complicated. On top of that there are not many Czechs (especially among the over 35s) who speak any language from over the border. Many young people now speak a modicum of English, but it can still be a struggle if you're trying to order at a restaurant with only older personnel.
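The beer-counting pattern (pivo / piva / piv) is at least regular enough to write down. A minimal sketch, assuming whole non-negative numbers and only the nominative-style counting form; `czech_beer` is just an illustrative name, and every other case and noun class is left out.

```python
def czech_beer(n):
    """Pick the form of 'pivo' that goes with the count n."""
    if n == 1:
        form = "pivo"   # one beer
    elif 2 <= n <= 4:
        form = "piva"   # two to four beers
    else:
        form = "piv"    # zero, or five and up
    return f"{n} {form}"

for n in (1, 2, 5):
    print(czech_beer(n))
```

Three forms for one noun in one case; English gets away with two, which goes some way to explaining why a translator trained mostly on English struggles.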
Touring Finland off the usual tourist trail we had a puncture. Luckily a small village garage was nearby.
Garages changing wheels often use an air tool to tighten the nuts - and they have previously proved impossible to undo with hand sockets. Asking for the nuts to be done to only sufficient tightness was a challenge. The owner only spoke Finnish. His wife was a Finnish native Swedish speaker who indicated she could speak English. After struggling for several minutes it was obvious we were getting nowhere. Fortunately my Swedish turned out to be just about adequate.
Our two other punctures were in Sweden and easier to get fixed. Strangely all three were on consecutive Fridays.
On the final Friday evening we were joking about "no puncture" - when the gearbox disintegrated near Helsinki. We were fortunate that our host's son was a VW Beetle enthusiast and could handle the garage's technical terms in Finnish, Swedish, and English.
Now I'm not superstitious - but we had arrived by boat in Sweden on a Friday the 13th.
Never forget when the company I used to work for wanted to offer an 'International Site'. They got someone or some company to translate a page to German, for review. The page was sent to a German colleague, with a request for comments.
His one and only comment was, 'Get another translator.'
I once used Google Translate (English to French) when wanting some logs to burn on my wood burning stove. The store owner was aghast that I wanted to burn his "records".
Unfortunately Google didn't realise the context was logs=wood, not logs=documents.
Thankfully I am now more or less fluent in French and don't use automated translations.
The European Parliament has simultaneous verbal translations for many of the languages in use by its members.
One day a German-speaking MEP was going on at great length - but in one passage the English translation channel went eerily silent. Suddenly the translator's frustrated voice exclaimed - "The verb man! The verb!".
Even if you don't use computer translation, you can get in trouble. Consider the Welsh municipality that emailed their staff translator to find out how to say "No entry for heavy goods vehicles". The answer came back promptly — "Nid wyf yn y swyddfa ar hyn o bryd" — so that's what they painted on the new road sign.
Unfortunately, that Welsh phrase wasn't their translation; it was an automated response that means "I am not in the office at the moment".
I was confused by the title of the article, but I eventually realized that it was about the AIpocalypse (starting with A.I. for Artificial Intelligence) and not the Alpocalypse - Alpo being a well-known American brand of dog food.
So you don't need Google Translate to cause confusion. A sans-serif typeface is enough to achieve that all by itself.
Last year it was reported in the main Finnish daily Helsingin Sanomat that Amazon had started selling Finnish-language e-books that were translated with Google from English classics. The samples shown were obviously hilarious or groanworthy. I have no idea if they kept doing it, one probably needs a Kindle to check that (no trace on the Amazon web site when I looked just now).
When I was an apprentice at a big motor manufacturer's, I attended a college as part of the course. Other motor manufacturers also sent their apprentices on the same course. One lad was from Rolls Royce, and he regaled us with the tale that, when a new model, to be called the Silver Mist, was ready to be introduced, the company sent all of its sales literature off to a (human) translator, who returned it with the name translated to Silber Nebel. The company said that no, the name must not be translated, it must stay as Silver Mist. The translator replied that the company might want to consider that Mist, in German, was a rude word, and roughly meant excrement (but not so politely). Needless to say, the name of the car was hastily changed and retranslated.