Project FAUST: Humans Helping to Improve Machine Translation

How far is science from creating a real live Babel Fish, the legendary gizmo from The Hitchhiker’s Guide to the Galaxy that can translate everything? Project FAUST is an EU-funded project that aims to develop machine translation systems able to respond “rapidly and intelligently” to user feedback. But in spite of the name, nobody is selling their soul to the machines. The humans involved are volunteers who are asked to rate different translation and localization services and make suggestions.

Statistical Translation

Back in the 1950s, linguists and scientists thought that once they had put all the rules of grammar into code, machine translation would be easy to develop. In practice, computers consistently failed to cope with the complexities of language. But 30 years later IBM came up with a new perspective: its researchers analyzed the relative frequency with which different groups of three words occur in sentences. It was the beginning of the statistical approach, which is now the major branch of machine translation used by services such as Google Translate.
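To make the idea of "relative frequency of groups of three words" concrete, here is a minimal Python sketch. It is a toy illustration only, not IBM's or Google's actual model: the function name, the two-sentence corpus, and the simple whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter


def trigram_relative_frequencies(sentences):
    """Return the relative frequency of every three-word group (trigram).

    Hypothetical helper for illustration: words are split on whitespace
    and lowercased, which real systems would handle far more carefully.
    """
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        # Slide a window of three words across the sentence.
        for i in range(len(words) - 2):
            counts[tuple(words[i:i + 3])] += 1

    total = sum(counts.values())
    if total == 0:
        return {}  # no sentence was long enough to contain a trigram
    return {trigram: n / total for trigram, n in counts.items()}


# Made-up example corpus (an assumption, not data from the project).
corpus = [
    "the fish is in the ear",
    "the fish is yellow",
]
freqs = trigram_relative_frequencies(corpus)
for trigram, freq in sorted(freqs.items(), key=lambda item: -item[1]):
    print(" ".join(trigram), round(freq, 3))
```

In this toy corpus, "the fish is" occurs in both sentences, so it receives the highest relative frequency. A statistical system built on such counts would treat that word sequence as more likely than its rarer neighbors, without knowing anything about what the words mean.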

FAUST: Feedback Analysis for User Adaptive Statistical Translation

Project FAUST is trying to address two different problems with web-based machine translation. On the one hand, current systems provide high-volume translation but little actual interaction. On the other, there are systems that ask users for feedback but do not change the translation as a result. The objective of the FAUST project is to “develop high-volume systems capable of adapting to user feedback in real-time.”

Limits of the Statistical Approach

FAUST is developing new data collection mechanisms to gather users’ quality assessments and suggested corrections from commercial statistical machine translation systems. But the problem with statistical translation in general is that computers still do not understand the meaning of the words or know any grammar; they depend entirely on the collection of massive numbers of source texts, which becomes more difficult when less-spoken languages are involved. Another problem is the “Google time loop,” which happens when one of the system’s own earlier translations appears again and seems to confirm its correctness, when it is actually an old mistake coming back.

The Babel Fish

How far are scientists from resolving all these problems and creating the Babel Fish as conceived by novelist Douglas Adams? Researchers are advancing one step at a time. FAUST wants to contribute by asking humans to rate translation services and suggest ways to improve the translations the machines get wrong. That is not as stunning as the small yellow animal Adams imagined in The Hitchhiker’s Guide to the Galaxy, which performs instant translations of alien languages once inserted into the ear, but it is considerably more useful for the time being.