Machine translation quality, as we all know, has yet to reach acceptable levels for many languages. The ways in which we use language in real life, and the idiosyncrasies of each language, are not easy for a machine to learn and reproduce.
Hence, one suggested approach is to write in a way that machines can understand, that is, to use controlled language.
Controlled language is a subset of a natural language with restricted vocabulary and grammar, designed to reduce ambiguity and complexity. It is commonly used by software, aviation, and automobile companies for certain types of text, like software strings or technical documentation.
Some of the rules of controlled language stem from plain language, which requires you to write clearly and precisely so that readers can digest the text more easily. Plain language requirements have been enacted into law in several countries, including the US, and plain language is arguably the best way to inform, empower, and even market.
But controlled language also differs from plain language in significant ways. First, let’s take a look at the ways in which controlled language is similar to plain language.
Where Controlled Language Meets Plain Language
- Write sentences that are shorter than 25 words.
- Write sentences that express only one idea.
- Write sentences in the active voice.
- Write sentences that use words from a general dictionary. Do not use technical words.
The above rules help create succinct, clear text that can be easily understood at the first attempt.
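Of these rules, sentence length is the easiest to check automatically. The following is a minimal Python sketch, not an actual controlled-language checker: the names `split_sentences` and `too_long` are hypothetical, the sentence splitter is a naive regular expression, and the 25-word limit comes from the first rule above.

```python
import re

MAX_WORDS = 25  # limit taken from the first rule above


def split_sentences(text):
    """Naively split after '.', '!' or '?' followed by whitespace.

    This is an assumption for illustration; text with abbreviations
    such as "e.g." needs a proper sentence tokenizer.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def too_long(text, max_words=MAX_WORDS):
    """Return the sentences that contain max_words or more words."""
    return [s for s in split_sentences(text) if len(s.split()) >= max_words]
```

Calling `too_long(draft)` on a draft returns only the sentences that break the rule, so a writer can shorten them one at a time.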
Where Controlled Language Differs from Plain Language
- Write sentences that repeat the noun instead of using a pronoun.
- Write sentences that have a simple grammatical structure, that is, avoid juxtaposition, subordination, relative pronouns, and so on.
If you follow the above rules of controlled language, your writing might look something like this:
Open the packet and put the packet on the table. You should be able to see the packet on the table easily.
…instead of more natural writing, like this:
Open the packet and put it on the table, where you should be able to see it easily.
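To put the two versions side by side, the short Python sketch below counts their words and flags third-person pronouns, which the noun-repetition rule disallows. The pronoun list and the helper names `words` and `report` are illustrative assumptions, not part of any controlled-language standard.

```python
import re

# Small illustrative pronoun list (an assumption, not an official rule set)
PRONOUNS = {"it", "they", "them", "he", "she", "him", "her"}


def words(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[A-Za-z']+", text.lower())


def report(text):
    """Return the word count and any flagged pronouns found in the text."""
    tokens = words(text)
    return {
        "word_count": len(tokens),
        "pronouns": [t for t in tokens if t in PRONOUNS],
    }


controlled = (
    "Open the packet and put the packet on the table. "
    "You should be able to see the packet on the table easily."
)
natural = (
    "Open the packet and put it on the table, "
    "where you should be able to see it easily."
)

print(report(controlled))  # 22 words, no flagged pronouns
print(report(natural))     # 18 words, "it" flagged twice
```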
Clearly, controlled language is not ideal for customer-facing content, and its proponents acknowledge that. It makes for stilted, unnatural-sounding language and, to some extent, it also seems overly simplified (something which plain language is decidedly not). Controlled language also ends up increasing the overall word count: the controlled version above runs to 22 words, against 18 for the natural one. Each sentence is shorter, but the reader has more text to get through, which sits awkwardly with the aim behind its first rule.
Controlled language urges writers to write for machines, so that they can easily understand the language and automate translation. But that defeats the purpose of writing itself. We don’t write for machines, but for humans. If machines have a tough time figuring out human language, too bad.
Sure, controlled language comes with the disclaimer that it’s only meant for certain types of text, but don’t consumers of technical documentation deserve easy-to-read text, too?
Machine translation itself comes with the disclaimer that it, too, suits only certain types of text. Which means that controlled language doesn’t necessarily extend the scope of MT. Plus, controlled language authoring can be expensive and requires time-intensive training – not the best option when you need to quickly deploy sites in multiple languages.
So, however painful it may be for machines to learn our language, that’s the only way there is. We should never have to write for machines. And it will be a very, very long time, if ever, before translators stop being the bridges between people.