The role of machine translation in Europe’s digital single market strategy
Friday, 15 May 2015
On January 7, 1954, a number of simple sentences in Russian were translated into English by a computer for the very first time. Ever since that achievement by the programmers behind IBM’s famous “Computer 701”, researchers, scientists and language experts have tried to perfect the mechanisms behind machine translation.
With improved technology and the growing volume of data that the internet has provided, efforts have been made to “crack the code” of flawless automatic translation. In Europe and Brussels, the focus is on the digital single market strategy. Because the European Union spends more money on translation than any other economic union, there is pressure to invest in more research and come up with new language technology for the European internal market.
The Brussels Times met with Andrejs Vasiljevs, CEO of Riga-based Tilde, which is at the forefront of European R&D in translation technology.
Andrejs Vasiljevs, CEO at Tilde
Q: EU institutions spend around €1 billion per year on translation and interpretation in order to maintain their policy of multilingualism. Some call it the expensive “Tower of Babel”. How can we cope with the additional costs that language barriers in the EU present while at the same time remaining competitive against single-language trading blocs such as the US or China?
A: I agree completely. Europe has made fantastic progress when it comes to the free movement of people, goods, capital and services. However, language barriers remain. It is good that services are freely available between EU member states, but in reality people don’t like to consume services that are not in their own language or a language they feel very comfortable with. As a result, European SMEs don’t scale up. Our companies and citizens lose out, and ultimately the consequence is that European economic competitiveness is at a disadvantage compared to the US and China. Under the so-called European digital market strategy agenda, several reports estimate that more than €300 billion a year in additional economic growth, from savings and increased productivity, can be expected if we implement it properly.
Q: What are the goals of a multilingual digital Europe and how can we achieve them?
A: The goal is to make multilingualism a priority and include it in the EU’s digital single market strategy. In practice, we envision creating a platform for quality machine translation for the benefit of all European citizens and companies.
So far, the EU has invested a lot in physical infrastructure. Roads, energy and agriculture have attracted large sums of EU money. It is time to invest in virtual infrastructure such as language technologies. The proposal is currently in a consultation process. During the Riga Summit 2015, we discussed and debated this issue, and the consensus was that it should be included in the EU’s agenda.
Q: Sixty years ago, scientists at Georgetown and IBM lauded their new translation machine as “the brain”. It had successfully translated a number of sentences from Russian into English, leading researchers to confidently claim that translation could be fully handled by machines in “the next few years”. In hindsight, they were too optimistic. What has changed that lets us now say that machine translation can soon become fully automated?
A: Back then, it was believed that we could describe languages by rules and “crack the code” through algorithms. It turned out that this was only possible for very simple sentences. Languages, however, are too complicated for that. Since then, we have had new developments in machine learning.
The IBM 701, “The Brain”
It is called the “statistical approach” and looks at how phrases can be translated in different contexts. The statistical approach to machine translation was developed in research circles in the early 90s, and the first industrial use took place in the mid-2000s. The idea is that the more data you have, the more information the computer has at its disposal to improve itself and its translations. In fact, this is how Google uses its data. They discovered that they could use all the data on their pages to develop a translation tool. They set about finding brilliant researchers and programmers in Europe to use their vast data and develop Google Translate. It is not perfect, however: translation is harder for smaller languages, where we don’t have enough data, and for grammatically complex languages, as in German to English translation.
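The core idea of the statistical approach can be illustrated with a minimal sketch. It assumes we already have a list of aligned phrase pairs and estimates how likely each translation is simply by counting. The function names and the German-English sample data are hypothetical, and real systems add much more (automatic alignment, language models, word reordering), so this is a simplification rather than a description of any particular product.

```python
from collections import Counter, defaultdict


def phrase_table(aligned_pairs):
    """Estimate P(target | source) as the relative frequency of each aligned pair."""
    pair_counts = Counter(aligned_pairs)
    source_counts = Counter(src for src, _ in aligned_pairs)
    table = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        table[src][tgt] = n / source_counts[src]
    return table


def best_translation(table, src):
    """Return the most probable translation, or None for an unseen phrase."""
    options = table.get(src)
    if not options:
        return None
    return max(options, key=options.get)


# Hypothetical aligned phrase pairs; a real corpus would hold millions.
pairs = [
    ("das Haus", "the house"),
    ("das Haus", "the house"),
    ("das Haus", "the building"),
    ("die Bank", "the bank"),
    ("die Bank", "the bench"),
    ("die Bank", "the bank"),
    ("die Bank", "the bank"),
]

table = phrase_table(pairs)
print(best_translation(table, "das Haus"))
print(table["die Bank"])
print(best_translation(table, "der Hund"))  # phrase never seen in the data
```

The last call shows why smaller languages are harder: a phrase that never appears in the training data has no estimate at all, and a phrase seen only a few times has an unreliable one.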
Q: Can you tell us a bit about the work of Tilde and where your technology is applied? Who are your clients and end users?
A: We do a lot of work on machine translation for smaller languages in the Baltic, Scandinavian and Eastern European regions. We have created a platform to customize and adapt translations for different user groups. To do this, we have built a large database of terminology from different specialized fields (e.g. medical and agricultural). Our technology has also been used in “E-Government”, and for Latvia’s presidency of the EU we developed hugo.lv. Professional translators can use our platform as a primary tool to translate text and then refine it through post-editing.
Q: Where do we go from here? What are the next milestones?
A: Our vision is that it will be easier for everyone to plug in and access good translation for free through a common open platform facilitated by the European Commission.
Today, people will not buy from or be interested in a company or online shop if they do not understand the language. Take eBay, for example: when it began providing services in Russian, the translations weren’t perfect, but its sales increased significantly. The problem is that only large companies can afford good translation. We would like everyone to have the opportunity to scale up their reach, for the benefit of companies, citizens and the European economy at large.
Some so-called experts argue that we need to wait until we have perfect technology for machine translation. However, even human translators aren’t perfect. We should start with what we have, invest more, and get closer to our goal and the full potential of the European digital agenda.