Daniel Lemire's blog


DeepL is as good as human translators?

How good is automated translation in 2017? There is a new company called DeepL that seems to have “cracked” the translation problem, in the sense that it makes fewer errors than non-professional human translators. That’s my claim, not theirs, but since their system is online for anybody to try, you can run your own experiments.

I decided to take an extract from a Michel Houellebecq novel to illustrate the point objectively. It is in French, but I do not assume you can read it:

À Charny il ne reste qu’une épicerie ; mais la camionnette du boucher passe le mercredi, celle du poissonnier le vendredi ; souvent, le samedi midi, sa grand-mère fait de la morue à la crème. Michel est en train de vivre son dernier été à Charny, mais il ne le sait pas encore. En début d’année, sa grand-mère a eu une attaque. Ses deux filles, qui vivent en banlieue parisienne sont en train de lui chercher une maison pas trop loin de chez elles. Elle n’est plus en état de vivre seule toute l’année, de s’occuper de son jardin.

Let us first try Bing Translator by Microsoft:

In Charny there is only one grocery store; But the butcher’s van spends Wednesday, the fishmonger’s truck on Fridays; Often on Saturday lunchtime, his grandmother makes cod with cream. Michel is living his last summer in Charny, but he doesn’t know it yet. At the beginning of the year, his grandmother had a seizure. His two daughters, who live in the suburbs of Paris, are now looking for a house not too far from home. She is no longer in a position to live alone all year, to look after her garden.

It is good but Bing gets the capitalization wrong, and “the butcher’s van spends Wednesday” is a bad translation. I think it is slightly worse than what a non-professional human translator would do.

[How well does Google Translate do on this passage](https://translate.google.com/#auto/en/À%20Charny%20il%20ne%20reste%20qu'une%20épicerie%20%3B%20mais%20la%20camionnette%20du%20boucher%20passe%20le%20mercredi%2C%20celle%20du%20poissonnier%20le%20vendredi%20%3B%20souvent%2C%20le%20samedi%20midi%2C%20sa%20grand-mère%20fait%20de%20la%20morue%20à%20la%20crème.%20Michel%20est%20en%20train%20de%20vivre%20son%20dernier%20été%20à%20Charny%2C%20mais%20il%20ne%20le%20sait%20pas%20encore.%20En%20début%20d'année%2C%20sa%20grand-mère%20a%20eu%20une%20attaque.%20Ses%20deux%20filles%2C%20qui%20vivent%20en%20banlieue%20parisienne%20sont%20en%20train%20de%20lui%20chercher%20une%20maison%20pas%20trop%20loin%20de%20chez%20elles.%20Elle%20n'est%20plus%20en%20état%20de%20vivre%20seule%20toute%20l'année%2C%20de%20s'occuper%20de%20son%20jardin.)?

At Charny there is only a grocery store; but the butcher’s van passes on Wednesday, that of the fishmonger on Friday; often, on Saturday noon, her grandmother made cod in the cream. Michel is living his last summer at Charny, but he does not know it yet. At the beginning of the year, his grandmother had an attack. His two daughters, who live in the Paris suburbs, are looking for a house not too far from their home. She is no longer in a condition to live alone all year round, to take care of her garden.

Unlike Bing, Google does not randomly capitalize the sentences, but there are awkward bits. The “that of the fishmonger on Friday” passage is hard to parse. And then Google gets the verb tense wrong in “her grandmother made cod in the cream”. I don’t like “his grandmother had an attack”. Like Bing, Google does not reach the “non-professional human level” threshold.

What about DeepL?

In Charny, there is only one grocery store left; but the butcher’s truck passes on Wednesdays, the fishmonger’s truck passes on Fridays; often, on Saturdays at noon, his grandmother makes codfish with cream. Michel is living his last summer in Charny, but he doesn’t know it yet. Earlier this year, his grandmother had a stroke. His two daughters, who live in the suburbs of Paris, are looking for a house not too far from their homes. She is no longer able to live alone all year round, to look after her garden.

DeepL is the only one to get “his grandmother had a stroke” right. Its translation is as good as what most human beings could do.

All three translation engines fail to attribute the daughters to the grandmother. No professional translator would make such a mistake, but I think many of us would.

Yes, I know that judging a system based on a single passage is methodologically problematic, but I ran many more tests that support my claim that DeepL is far above Google and Bing. I could elaborate further, but I’d encourage you instead to try it out.

Credit: I found out about DeepL via Peter Turney.

Note: My wife is a professional translator. I’m not claiming that she is about to become obsolete. Not by a long shot. But as a professional translator, she is much better at translation than 99% of us.