Progress in Machine Translation

Machine translation is nothing new—at least not in the realms of science fiction. In his 1945 novella “First Contact,” American author Murray Leinster (a pen name of William Fitzgerald Jenkins, 1896–1975) proposed a device he called the “universal translator” (UT). By the time the series “Star Trek” debuted on U.S. television in 1966, the UT was imagined as standard communication equipment for the 23rd century, instantly translating alien tongues into flawless English, breaking through cultural barriers and leading to universal peace.

But Leinster and the creators of Star Trek greatly underestimated how rapidly technology would develop over the next several decades. What they envisioned in those days as the scientific breakthrough of a distant fictional future was already being researched by their contemporaries—linguistic engineers whose inventions would lead to new fields in voice recognition and artificial intelligence as well as translation by machines.

Historic Milestones

In 1949, long before most of the world imagined what feats computers could accomplish, mathematician Warren Weaver (1894–1978) formally introduced the possibility of using digital means to translate documents between natural human languages. His scholarly paper entitled “Translation” covered four specific areas: identifying logical elements in language; addressing the problem of multiple meanings by examining immediate context; applying cryptographic methods; and exploiting the likely existence of linguistic universals.

Two years later, researchers at MIT and Georgetown University began working on machine translation (MT) systems that would lead to public demonstrations in 1954 and a world MT conference in London in 1956. By 1962, the Association for Machine Translation and Computational Linguistics had been formed in the U.S., and when a 1966 report skeptical of MT’s progress caused government funding for new projects to dry up, private enterprise stepped in.

In 1970, the French Textile Institute used MT to translate abstracts to and from French, English, Spanish and German. By 1978, Xerox had begun using SYSTRAN to translate technical manuals. A company called Trados was formed in 1984 to develop and market translation memory technology, and SYSTRAN launched on the World Wide Web in 1996, offering free translation of small texts. Then AltaVista brought its MT system named “Babel Fish” to the Internet, garnering as many as half a million translation requests per day in 1997.

As the new millennium unfolded, companies all over the world began competing for shares of the burgeoning MT industry, and innovation led the way. In 2003, the future head of Translation Development at Google, Franz Josef Och, developed a method for speeding up the MT process, leading to the launch of Google Translate in 2006. MOSES, the open-source statistical MT engine, was launched in 2007. The next year in Japan, a text/SMS translation service was introduced for mobile devices, followed in 2009 by the first mobile phone with built-in speech-to-speech translation functionality, offering services in Japanese, English and Chinese.

State-of-the-Art MT

As an indication of just how big a role MT has already come to play in everyday life, during 2012 Google Translate was translating enough text from one language to another each day to fill approximately one million books. In May 2013, Google added Bosnian, Cebuano, Hmong, Javanese and Marathi to its list of translatable written languages available on the Internet, bringing the total to 64. Additionally, an iOS app released by Google in 2011 now accepts voice input in 15 languages, translates written words or phrases into more than 50 languages, and speaks translations aloud in 23 of them.

As impressive as those advances may be, perhaps even more inspiring is the story of a British inventor named Will Powell. In the summer of 2012, he demonstrated a mobile phone system that translates both sides of a conversation between headset-wearing English and Spanish speakers. It displays translated text on their mobile phone screens like subtitles in a foreign film.

Then, there is the new service recently launched by NTT DoCoMo, the largest mobile-phone operator in Japan. It translates phone calls between Japanese and English, Chinese or Korean as each party speaks consecutively. Eavesdropping computers translate their words in just seconds into male or female voices, as appropriate.

And when Microsoft’s chief research officer, Rick Rashid, spoke in English at a conference in Tianjin in October 2012, live MT was used to translate his presentation into Mandarin. It was delivered not only as subtitles projected on screens overhead, but also as a computer-generated voice that reflected the actual tones and inflections of Rashid’s own voice in the Chinese version.

Will Machines Replace Human Translators?

The Microsoft example is perhaps the cutting edge of current MT capabilities, employing highly sophisticated neural networks to mimic a person’s natural voice in a foreign language. But far more than voice recognition and replication stands between today’s machine translation and the skills of a true human translator.

For one, translation is not only about decoding the meaning of words in a source language and re-encoding that meaning in a target language. It also involves interpreting and analyzing all the features of the original words. That requires in-depth knowledge of the grammar, semantics, syntax, idioms, nuances and other aspects of the source language plus an understanding of the culture of the speaker or writer. A similar in-depth knowledge of the target language and culture is needed to re-encode the meaning.

Then there is the problem of ambiguity. Swiss psychologist and U.N. translator Claude Piron (1931–2008) once complained that “machine translation, at its best, automates the easier part of a translator’s job; the harder and more time-consuming part usually involves doing extensive research.” As an example, he cited the phrase “Japanese prisoner of war camp.” It could mean either an American camp holding Japanese prisoners or a Japanese camp holding American prisoners. The English carries both senses, so research would be needed to clarify which is meant.

That’s why, according to Piron, “a translator need(s) a whole workday to translate five pages, and not an hour or two…. About 90% of an average text corresponds to simple conditions. But unfortunately, there’s the other 10%. It’s that part that requires six [more] hours of work. There are ambiguities one has to resolve.”

Programmers are applying both linguistic rules and statistical models to make machine translation more accurate. They are also working on ways to teach computers to learn from their mistakes and to truly “understand” language rather than merely manipulate it. Chances are, if a universal translator can be created, it will happen well within the next 200 years. In the meantime, human translators can still be relied upon to ensure that what’s communicated is what’s actually meant, and not just what’s been written or said.
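To give a flavor of the statistical side of that work, the tiny sketch below illustrates the “noisy-channel” idea behind engines like Google Translate and MOSES: score every candidate translation by multiplying how likely the target words are as translations (a phrase table) by how fluent the result sounds (a language model), and keep the highest-scoring sentence. The phrases, probabilities, and scores here are invented purely for illustration, not taken from any real system.

```python
from itertools import product, permutations

# Hypothetical phrase table: source word -> possible English
# renderings with translation probabilities (invented numbers).
PHRASE_TABLE = {
    "maison": [("house", 0.7), ("home", 0.3)],
    "bleue": [("blue", 0.9), ("sad", 0.1)],
}

# Hypothetical bigram language-model scores measuring fluency
# of the English output (also invented).
LM = {
    ("blue", "house"): 0.05,
    ("house", "blue"): 0.001,
    ("blue", "home"): 0.02,
    ("home", "blue"): 0.001,
}

def translate(source_words):
    """Try every combination of phrase choices and word orders,
    returning the rendering with the best combined score."""
    best, best_score = None, 0.0
    options = [PHRASE_TABLE[w] for w in source_words]
    for choice in product(*options):
        # Product of translation probabilities for the chosen phrases.
        trans_prob = 1.0
        for _, p in choice:
            trans_prob *= p
        words = [w for w, _ in choice]
        # Reordering matters: "maison bleue" should become "blue house".
        for order in permutations(words):
            score = trans_prob * LM.get(tuple(order), 1e-6)
            if score > best_score:
                best, best_score = " ".join(order), score
    return best

print(translate(["maison", "bleue"]))  # → blue house
```

Even in this toy form, the model captures why statistical MT handles Piron’s “easier 90%” well: frequent, unambiguous phrases score decisively. It also shows the limit he described, since nothing in these probabilities can do the real-world research a genuinely ambiguous phrase demands.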
