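The article below stresses how much "context" matters to "meaning." See my earlier essay, “'Meaning' and 'Translation': With Comments on the Chinese Translation of the Dictionary of Philosophy.” I have not used Google's translation tool myself, so I cannot comment on how capable it is, or how good its results are, when translating between Chinese and English. Readers who have used it are welcome to share their experience.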
The Surprising Reason Europe Came Together Against Putin
Claire Berlinski, 02/03/23
But there is another, less widely acknowledged source of Europe’s newfound unity: The latest version of Google Translate, which has turned the ancient dream of a world without language barriers into reality.
Jérôme Piodi, a French Eurocrat who has spent more than a decade in public administration in the European Parliament and in related Parisian ministries, said the key factor in making progress in Europe is a common understanding of complex ideas. “Until very recently, access to instantaneous translation of speech and ideas was reserved to a certain kind of elite — the kind who could spend money to pay translators,” Piodi said.
Europe has more than 200 native languages and mutually incomprehensible dialects. All of its 24 official languages are highly developed, each with its own media, textbooks, movies and language academies. These languages, and their use in schools, workplaces and families, define a country’s identity.
But we’re now living, for the first time, in an era where everyone in Europe — from politicians to cab drivers — can understand one another. It’s true that previously, diplomats could communicate through translators and, typically, in English. Now, ordinary Europeans can understand one another, instantly and accurately, and because of the compulsive lure of social media — and Twitter’s decision to automatically translate every tweet — Europeans can and do talk to each other all day long. Talking to Ukrainians, and hearing directly from them, has hardened public support for sanctions and weapons transfers in the EU, despite Russian threats and soaring energy prices. Eurobarometer polling shows that 74 percent of EU citizens back the bloc’s support for Kyiv.
This public support for Ukraine has translated into action. The West’s assistance to Ukraine has also been notable for the way Western politicians have responded to their citizens’ sentiment, rather than shaping it. At every stage, citizens have pushed their leaders to move faster and further. We’ve seen this recently in German Chancellor Olaf Scholz’s decision to send Leopard 2 tanks to Ukraine after an eternity of hesitation and dithering. He faced mounting public fury and protests, withering criticism and an outraged social media campaign to #FreeTheLeopards. In late January, Scholz relented and freed the Leopards.
Google Translate isn’t the complete explanation for the newfound European unity, of course, but it’s an underappreciated part of the story.
“It’s had a huge effect on people and their ability to share ideas on social media,” Piodi says. “Twitter is a small window on the world; Google Translate made the window bigger.”
While Peter Thiel lamented receiving 140 characters instead of flying cars, Google was working on a technological revolution that makes flying cars seem like the horse and buggy: high-quality machine translation. The audacity of its accomplishment has been curiously uncelebrated. It ranks with the mRNA platform upon which our Covid vaccinations were built as a great achievement of the 21st century, but it has mostly changed our world without applause. Few truly grasp the technological revolution that has transpired in the past several years.
Research into machine translation, inspired by the mathematician Claude Shannon’s work in information theory, among others, began in the 1950s. Early prototypes relied upon bilingual dictionaries and hand-coded rules. The results were garbled.
In 1964, the U.S. government established a commission to study machine translation. The commission declared the project hopeless: Human language was too subtle, complex, idiomatic, irregular and ambiguous for it to work. The Defense Department ceased funding research, and the technology stalled for decades.
Those early approaches foundered because researchers used a dead-end approach. They had envisioned machines learning language much the way humans learn second languages — by studying a grammar. They tried to analyze sentences in terms of the rules that governed them and translate them into a universal machine language, from which they could then be re-translated into the target language. The approach, called rule-based machine translation, or RBMT, failed because human language is indeed too subtle, complex, idiomatic, irregular and ambiguous for that to work.
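The dictionary-plus-rules idea described above can be illustrated with a minimal sketch. It is not a reconstruction of any historical system: the three-word English-to-Spanish lexicon, the part-of-speech tags, and the function name `rule_based_translate` are all invented for illustration, and the sketch skips the intermediate "universal machine language" step entirely.

```python
# Invented toy lexicon: English word -> (Spanish word, part-of-speech tag).
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
}


def rule_based_translate(sentence):
    """Translate word by word, then apply one hand-coded reordering rule."""
    # Dictionary lookup; unknown words pass through untranslated.
    tagged = [LEXICON.get(word, (word, "UNK")) for word in sentence.lower().split()]

    # Hand-coded rule: English ADJ NOUN becomes Spanish NOUN ADJ.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2
        else:
            i += 1

    return " ".join(word for word, _tag in tagged)


print(rule_based_translate("the red car"))   # el coche rojo
print(rule_based_translate("the blue car"))  # el blue coche
```

The second call hints at why the approach scaled badly: every word missing from the lexicon, and every idiom or exception to a rule, has to be added by hand.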
With the growing power of processors and falling price of data storage, however, machine translation became a feasible target for the private sector. Google had ample resources for a project like this. Google’s early prototype, which debuted in 2006, was based on statistical machine translation, or SMT. SMT presumes that for each phrase, there are many possible translations, some more and some less likely to be correct. It works by searching a massive corpus of translated texts to see which translation is statistically most probable. The first Google Translate used phrase-based SMT — phrase-based, because it translates one phrase at a time, without considering the context of the phrase.
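A minimal sketch of the phrase-based idea, under loud assumptions: the aligned phrase pairs and their counts are invented toy data rather than real corpus statistics, the function names are hypothetical, and real SMT systems also used language models and reordering that are omitted here. It shows only the core step the paragraph describes, picking the statistically most frequent translation for each phrase without looking at its neighbors.

```python
from collections import Counter, defaultdict


def build_phrase_table(aligned_pairs):
    """Count how often each source phrase was paired with each target phrase."""
    table = defaultdict(Counter)
    for source, target in aligned_pairs:
        table[source][target] += 1
    return table


def translation_probability(table, source, target):
    """Relative frequency of `target` among all observed translations of `source`."""
    total = sum(table[source].values())
    return table[source][target] / total if total else 0.0


def translate_phrase_by_phrase(phrases, table):
    """Pick the most frequent translation of each phrase, ignoring context."""
    output = []
    for phrase in phrases:
        candidates = table.get(phrase)
        if not candidates:
            output.append(phrase)  # unknown phrase: pass through unchanged
            continue
        best, _count = candidates.most_common(1)[0]
        output.append(best)
    return " ".join(output)


# Invented toy data standing in for a large corpus of translated texts.
TOY_PAIRS = [
    ("süt", "milk"), ("süt", "milk"), ("süt", "milk"),
    ("liman", "port"), ("liman", "port"), ("liman", "harbor"),
]

table = build_phrase_table(TOY_PAIRS)
print(translate_phrase_by_phrase(["süt", "liman"], table))  # milk port
```

Because each phrase is scored on its own, an idiom whose parts are common words comes out as a literal string of their most frequent translations.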
Such an engine can only be as good as the corpora of translated texts upon which it’s based. For this, Google used United Nations and European Parliament transcripts. The original version was popular, despite its deficiencies, and by 2016, it translated 140 billion words per day.
But while sheer processing power gave Google an edge over other SMT engines, it was still a primitive product. Characteristic was an infamous fiasco, in 2013, involving the English-language version of the Turkish daily Yeni Şafak and the old version of Google Translate. The newspaper decided to embroider an interview with Noam Chomsky with a few fabricated quotes suggesting his enthusiastic support for the Turkish government. (This is typical of Yeni Şafak, an Islamist paper known for fabrications and half-truths.) It ran these invented quotes through the old Google Translate, then published these immortal lines: “This complexity in the Middle East, do you think the Western states flapping because of this chaos? Contrary to what happens when everything that milk port, enters the work order, then begins to bustle in the West. I’ve seen the plans works.”
“Milkport” — from the Turkish süt liman, an idiom akin to “smooth sailing” — became Turkish shorthand for an amalgam of ludicrous machine translation and fake news.
Improvements in quality had stalled.
The revolution came in 2016, when Google introduced digital neural networks, modeled on the way learning takes place, we think, in the human brain. A Neural Machine Translation (NMT) model uses neural networks to study the relationship between the source and target languages by processing massive amounts of parallel text data. It learns from the data and improves the translations by adjusting the weights of the neurons. Unlike its predecessor, it isn’t phrase-based. In NMT, words or parts of words are converted into numerical representations called “word vectors.” These contain information not only about the meaning of the word (單字的意義), but its context (單字被使用的脈絡). So “milk,” for example, no longer merely represents a word that may be translated as leche, Milch, or молоко. It represents all the information the model has about how humans use that word.
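The notion of a vector that carries context as well as meaning can be sketched in a few lines. This is a toy illustration, not Google's model: the three-dimensional vectors are hand-made rather than learned, and the simple averaging in the hypothetical `contextual_vector` function stands in for the trained network layers a real NMT system would use.

```python
import math

# Hand-made toy vectors (invented numbers). Dimensions, loosely: [food, sea, calm].
# Real models learn vectors with hundreds of dimensions from parallel text.
TOY_VECTORS = {
    "milk":   [0.9, 0.0, 0.1],
    "coffee": [1.0, 0.0, 0.0],
    "harbor": [0.0, 1.0, 0.2],
    "sea":    [0.0, 0.9, 0.3],
}


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def contextual_vector(word, neighbors, vectors, context_weight=0.5):
    """Blend a word's own vector with the average vector of its neighbors."""
    own = vectors[word]
    if not neighbors:
        return list(own)
    dims = len(own)
    avg = [sum(vectors[n][i] for n in neighbors) / len(neighbors) for i in range(dims)]
    return [(1 - context_weight) * own[i] + context_weight * avg[i] for i in range(dims)]


milk_in_kitchen = contextual_vector("milk", ["coffee"], TOY_VECTORS)
milk_at_sea = contextual_vector("milk", ["harbor"], TOY_VECTORS)

# The same word ends up nearer to "coffee" or to "sea" depending on its neighbors.
print(cosine(milk_in_kitchen, TOY_VECTORS["coffee"]) > cosine(milk_at_sea, TOY_VECTORS["coffee"]))
print(cosine(milk_at_sea, TOY_VECTORS["sea"]) > cosine(milk_in_kitchen, TOY_VECTORS["sea"]))
```

Both comparisons print `True`: the representation of "milk" shifts with the words around it, which is the property a phrase-by-phrase lookup lacks.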
Google formally launched its NMT model for Google Translate in November 2016. It did so discreetly and with little fanfare. By the next day, it had shown improvements equal to the total gains the old system had shown over its lifetime. It continues to learn at this speed. The results, now in more than 109 languages, are astonishing. Mother-tongue language speakers asked to rate Google’s translations on a scale from 0 to 6 offer an average rating of 5.43.
It’s not entirely free of error, of course. At times — especially when the original text is highly idiomatic, misspelled or full of shorthand — the translations are imperfect. But they’re almost always good enough that you can get the gist (搞懂原文的要點). The machine model can also be rigged to provide deliberate mistranslations: For a time, for example, it automatically converted “Russian Federation” to “Mordor,” “Russians” to “occupiers,” and the name of Russia’s foreign minister, Sergey Lavrov, to “sad little horse.” But Google Translate is used by too many people, daily, for fraud to be sustained.
In 2019, the Annals of Internal Medicine published a study pronouncing Google Translate so accurate that it could be used to translate the results of medical trials — a task where an error could have deadly consequences. Professional translators hate it. Of course they do: It’s putting them out of work. They’re prone to writing articles insisting that Google doesn’t translate properly. It’s true that for literary nuance, you want a human translator. But for everyday translation — in medicine, in courts, in diplomacy, even — Google Translate often does the job as well as a professional and does it faster, for free. Most participants in translation Turing tests are unable to distinguish its translations from a human’s.
Although these advances were astonishing, it was perhaps unsurprising that many people didn’t realize they had happened at all. If you’re an English speaker, your search engine will serve you English, not foreign-language results. (Google earns money by selling advertising, and you’re not likely, if you live in Milwaukee, to do your shopping in Budapest.) Unless you traveled to foreign countries frequently, Google Translate likely wouldn’t be a daily part of your life.
The new technology’s relatively low profile changed by late 2020, when Twitter integrated the new Google Translate into its platform, replacing the comparatively primitive Bing translation service, which no one liked. From then on, every single tweet on the platform was translated automatically into the user’s native tongue.
This, says Piodi, was the “almost perfect combo, with high [internet] connectivity in most of Europe allowing citizens in Paris, London, Kyiv or Stockholm to (almost) have an immediate understanding of the others.” Twitter integrated the translation engine seamlessly. You didn’t need to sign up, opt in or laboriously copy-and-paste. Suddenly, the whole community of Twitter users could read everyone else’s tweets, no matter what language they were written in. Twitter became multilingual, with people following foreign language accounts and replying to them in their native language, knowing their response would be translated automatically.
Other social media platforms have incorporated Google Translate, too, but Twitter plays a unique role in the social media ecosystem because it’s entirely text-based and because accounts on Twitter are interlinked in a way that makes it ideal for rapid news diffusion and debate. Unlike Facebook or Instagram, Twitter’s primary function isn’t the maintenance or expansion of personal contacts, but the dissemination (傳播) of news and information. This is why journalists, politicians, NGOs and PR companies are disproportionately represented on Twitter — and why it has outsized political influence. This structure and user base makes Twitter an ideal venue for testing slogans, debunking lies, reproaching politicians and winning converts.