Wednesday, April 02, 2008

FT.com / Technology - Translation technology: Language subtleties make full automation a myth

Translation technology: Language subtleties make full automation a myth
By Alan Cane

Published: April 2 2008 02:23 | Last updated: April 2 2008 02:23

Machine translation has come a long way since the heady, early days of the digital computer, when it seemed only a matter of time before nation would speak unto nation through the intermediary of binary digits.

That this is taking longer to come about than originally hoped is a consequence of the complexity of natural language and the limited power of the machinery.

Even today, the most powerful supercomputers cannot guarantee a perfect translation of technical documents – and nobody is foolhardy enough to trust the intricacies of a novel or a poem to a machine.

“The myth of fully automated translation is just that – a myth,” says Mark Lancaster, chief executive of the UK company SDL, a leader in commercial technical translation. “Languages are just too complex for us to be able to automate the whole process.”

“Automated translation works well enough if you simply want to get an understanding of a document,” says Eric Blassin, head of technical development for Lionbridge, a US company that claims to be the world’s largest commercial translation services group.

This is described as a “gisting” system. It will give you the gist of a document, although at the risk of significant errors or loss of sense.

Automated translation can, however, save a translator time – he or she acts as a reviewer, correcting errors and mistranslations rather than working on the whole text from scratch.

There is a wide range of systems of varying degrees of power and flexibility available today to tackle technical translation.

At one end of the spectrum, there are gadgets such as a hand-held device from the US company, Franklin, priced at a modest £150 ($297), that will translate between 12 languages using 450,000 words and 12,000 phrases packed into its memory.

At the other end are the substantial enterprise systems from SDL or Lionbridge, capable of processing tens of millions of words every month for international customers. These two companies, leaders in their respective fields, present an interesting comparison in terms of business models.

Then there are mid-range systems from companies such as Systran – one of the pioneers of automated translation – that are suited to small and medium sized organisations. Systran technology powers online translation services such as Google Translate and Alta Vista’s Babelfish.

Gisting systems have to be treated with caution, however, as an unfortunate party of Israeli journalists found out last year. Invited to The Netherlands to interview the Dutch foreign minister, they were asked to submit their questions in advance in English.

The task, however, was left to the only non-English speaker among them, who used the online translation site http://www.babylon.com uncritically to create the text. The resulting nonsense came close to sparking an international incident and the visit was cancelled.

Why are translation services important? Companies hoping to succeed in global markets have, necessarily, to localise their products – that is, adapt them to local conditions and customs.

They can also contribute to matters of life and death. IBM, a leading researcher into translation systems, last year provided the coalition forces in Iraq with automatic translation devices and special software to recognise and translate more than 50,000 English and 100,000 Iraqi Arabic words.

The aim was to use the devices in hospitals and in training. The technology, known as Mastor, allows users to converse naturally, producing audible and text translations of spoken words.

Mark Lancaster of SDL began his career as a software engineer working for the US companies Lotus Development Corporation and Ashton-Tate, now part of the Borland group.

“We were always having difficulty in getting our products into the global market quickly and efficiently. It took months or even years for Japanese double-byte type projects.” Double byte refers to the number of bytes required to code for a Japanese, Chinese or Korean character – English is a single-byte language.
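To make the single-byte versus double-byte distinction concrete, here is a minimal sketch that counts the bytes needed to store a few characters. Shift_JIS is used only as an assumed example of a legacy Japanese encoding; the article does not say which encodings the products in question had to support, and the helper name is hypothetical.

```python
# Illustrative sketch. Shift_JIS is an assumed example encoding,
# not one named in the article.

def encoded_length(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))

# A Latin letter, a single kanji, and a three-kanji word.
for sample in ["A", "日", "日本語"]:
    print(sample, encoded_length(sample, "shift_jis"), "byte(s) in Shift_JIS")
```

Software that assumes one byte per character will cut strings in the wrong place or size buffers incorrectly when it meets such text, which is one reason these projects could take so long.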

Hardened by this experience, SDL was formed in 1992 to provide consultancy to US businesses hoping to break into global markets. Mr Lancaster swiftly saw a need for technology to help the translation process: “We created technology for translators. It was a sort of word processor: it helped with the translation in a word-processor-like environment.”

The company was providing both translation and localisation services. “We helped to project-manage the product. We would say to clients: ‘If you want to get this into 20 languages, you probably need to re-architect the product, so it supports accented characters, sort sequences and date formats’ – things specific to the markets they wanted to enter.”

Today, SDL is a quoted company providing translation services to customers using its in-house team of 700 translators working from 30 offices worldwide and up to 10,000 freelances.

Its principal activity, however, is licensing its translation management software to large corporate users such as Hewlett-Packard, Philips, Dell and Bosch.

There are two parts to the software: the translator’s workstation and the central language repository which grows continually as words and phrases are added: “The more they use it, the more they can leverage content that has previously been translated,” Mr Lancaster says.

He describes a typical translation process: “Someone writes, for example, a new product summary on a website. It is likely this will be created in a content management system which will alert our translation management system that there is new text on the website.

“It will take that piece of text and compare it with text in the multi-lingual repository. It will look for any text that has already been translated or is similar – something we call fuzzy matching – so that can be used again.

“It will never be completely automatic. For existing customers, we think that 50 per cent of new text can be matched to existing text, 10 to 20 per cent gives a partial match and the remaining 30 per cent has to be done from scratch.

“The trick is that we believe about 90 per cent of the professional translation community use our desktop technology. The joy is that when customers buy our enterprise technology, they know the translators, whoever they may be, can plug in seamlessly to the management system. This smooths the supply chain and saves everybody a lot of money.”
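The matching step Mr Lancaster describes can be sketched roughly as follows. This is not SDL's algorithm, which the article does not disclose: it is a minimal illustration that uses the similarity ratio from Python's standard difflib module, with a hypothetical in-memory repository and an assumed threshold of 0.75 for what counts as a partial match.

```python
from difflib import SequenceMatcher

def classify_segment(segment, memory, fuzzy_threshold=0.75):
    """Classify a new source segment against a translation memory.

    `memory` maps previously translated source text to its translation.
    Returns (kind, suggested_translation, score), where kind is
    "exact", "fuzzy" or "new". The threshold is an assumption chosen
    for illustration, not a figure from the article.
    """
    # An identical segment has been translated before: reuse it as is.
    if segment in memory:
        return "exact", memory[segment], 1.0

    # Otherwise find the most similar stored segment.
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, segment, source).ratio()
        if score > best_score:
            best_source, best_score = source, score

    # Close enough: offer the old translation for a human to adapt.
    if best_source is not None and best_score >= fuzzy_threshold:
        return "fuzzy", memory[best_source], best_score

    # Nothing useful found: the segment must be translated from scratch.
    return "new", None, best_score

# Hypothetical repository with two English-to-Spanish entries.
memory = {
    "Press the green button to start.": "Pulse el botón verde para empezar.",
    "Insert the battery before use.": "Inserte la batería antes de usar.",
}

for text in [
    "Insert the battery before use.",
    "Press the red button to start.",
    "Warranty terms differ by country.",
]:
    kind, suggestion, score = classify_segment(text, memory)
    print(kind, round(score, 2), suggestion)
```

A fuzzy match is only a suggestion: in the second example the stored translation still says "verde" (green), so a translator has to correct it, which is consistent with the point that the process is never completely automatic.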

He quotes the example of the investment bank Morgan Stanley. SDL translates its research into several languages. Before the use of the technology it used to take up to five Morgan Stanley translators several months to convert a single document. Now the same document can be translated into four languages within 24 hours.

Lionbridge, now 12 years old and a spin-off from R.R. Donnelley, the large commercial printer, operates a different business model. It has developed central translation management software like SDL but for its own use: it provides only translation services.

Mr Blassin says: “We have developed a core platform [software system] that we use across all our vertical markets and all our customers. This enables us to recycle everything that has previously been translated.

“This platform is probably unique in that we were the first to introduce a pure, internet-based platform in this industry. Translators around the world are connected to the platform and translate online. They have only a small amount of software on their desktop machines and all the data stays in Lionbridge’s central system.

“This makes for consistency when many translators are working on the same project. The platform is partitioned and each customer has its own area.” Last year, 2,000 translators working online processed about 500m words using the Lionbridge software for companies such as Cisco, Du Pont and General Electric.

Lionbridge applies automation to the quality control process, checking for accuracy, consistency and the elimination of “false friends” – words which look similar in two languages but which have entirely different meanings. A British stationery company, for example, tried to launch a non-leaking fountain pen in Spain with the angle that “it won’t leak in your pocket and embarrass you”. It assumed that the Spanish word “embarazar” means “to embarrass”. In fact, it means “to make pregnant”.
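One simple way such a check could work is a list of known risky word pairs that is scanned against each source sentence and its translation. The sketch below is an assumption for illustration only: the article does not describe how Lionbridge implements its checks, and the table, patterns and function name are hypothetical.

```python
import re

# Hypothetical table of false friends per language pair. Each entry is
# (pattern in the source text, pattern in the translation) that should
# be flagged for review when both occur together.
FALSE_FRIENDS = {
    ("en", "es"): [
        (r"\bembarrass\w*", r"\bembaraz\w*"),
    ],
}

def flag_false_friends(source, target, pair=("en", "es")):
    """Return (source_word, target_word) pairs a reviewer should check."""
    flags = []
    for source_pattern, target_pattern in FALSE_FRIENDS.get(pair, []):
        source_hit = re.search(source_pattern, source, re.IGNORECASE)
        target_hit = re.search(target_pattern, target, re.IGNORECASE)
        if source_hit and target_hit:
            flags.append((source_hit.group(0), target_hit.group(0)))
    return flags

print(flag_false_friends(
    "It won't leak in your pocket and embarrass you.",
    "No goteará en su bolsillo ni le embarazará.",
))
```

A check like this only raises a flag; deciding whether the wording is actually wrong remains a job for a human reviewer.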

The aim of companies like SDL and Lionbridge is to automate as far as possible the entire life cycle of text production, providing aids for the author, translation, intelligibility, review, compliance, desktop publishing and distribution.

Mark Lancaster says: “Companies’ websites are still mostly primitive and trivial. Updating them and company brochures and getting them out to customers can be expensive if you don’t have a system to manage it.”
Copyright The Financial Times Limited 2008
