Back in December of 2016, the New York Times published an article on the Google Brain team and how neural networks had elevated the accuracy of Google Translate to a human-like level. Still, these systems, built on complex architectures involving recurrent neural networks (RNNs) and convolutional neural networks (CNNs), were computationally expensive and limited by their sequential nature.

To illustrate, take the following sentences where the word “bank” has a different meaning in context:

“I arrived at the bank after crossing the road.”

“I arrived at the bank after crossing the river.”

Humans can quickly determine that the meaning of “bank” depends on “road” or “river.” Machines using RNNs, however, process the sentence sequentially, reading it word by word, before establishing a relationship between “bank” and “road” or “river.”

This sequential processing becomes a bottleneck on TPUs and GPUs, which are built for parallel computing. Although CNNs are less sequential than RNNs, the number of computational steps needed to relate two words still grows with the distance between them. Thus, both architectures struggle with sentences like those presented above.
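To make the bottleneck concrete, here is a minimal sketch in NumPy (with toy sizes chosen purely for illustration): the RNN loop cannot compute step t until step t-1 has finished, while all pairwise word-to-word scores can be formed in one matrix product with no dependence between positions.

```python
# Minimal sketch of the sequential bottleneck (toy dimensions, illustrative only).
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 6, 8                      # toy sentence length and embedding size
x = rng.normal(size=(seq_len, d))      # one embedding per word
W_h = rng.normal(size=(d, d))          # recurrent weights
W_x = rng.normal(size=(d, d))          # input weights

# RNN: each hidden state depends on the previous one, so this loop
# cannot be parallelized across the time dimension.
h = np.zeros(d)
for t in range(seq_len):
    h = np.tanh(h @ W_h + x[t] @ W_x)

# Attention-style scoring: every word-to-word score is computed at once,
# in a single matrix product that parallel hardware handles well.
scores = x @ x.T                        # shape (seq_len, seq_len)
```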

Both the Transformer and DeepL overcome this limitation by applying an attention mechanism to model relationships between words in a sentence more efficiently. While each system's actual implementation is different, the underlying principle stems from the paper “Attention Is All You Need.”

For a given word, attention-based systems compute an “attention score” that models the influence every other word has on it. The key difference is that attention-based systems need only a small, fixed number of these steps to connect any pair of words, and they feed the score-weighted average of the word representations forward to generate a new representation of the word in question.
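As a rough illustration of how these scores are computed, here is a minimal NumPy sketch of scaled dot-product attention as described in “Attention Is All You Need”; the random inputs, projection matrices, and dimensions are toy assumptions, not the paper's actual configuration.

```python
# Scaled dot-product attention: scores, softmax weights, and the
# weighted average that becomes each word's new representation.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # attention scores between all word pairs
    weights = softmax(scores, axis=-1)  # normalize each word's scores
    return weights @ V, weights         # weighted average of the value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16
x = rng.normal(size=(seq_len, d_model))                 # one embedding per word
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

context, weights = attention(x @ W_q, x @ W_k, x @ W_v)
print(weights.shape)   # (5, 5): one row of scores per word
print(context.shape)   # (5, 16): new representation for each word
```

Each row of `weights` holds one word's scores over the whole sentence, and the corresponding row of `context` is the score-weighted average used as that word's new representation.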

This mechanism is applied to translation in the following way. Previous approaches to machine translation had an encoder create a representation of each word (unfilled circles below) and a decoder generate the translated result from that information. Attention-based systems add attention-weighted context to the initial representations (filled circles below), in parallel for all the words, with the decoder acting similarly.

Attention scores in machine translation
Image Credit: Google Blog
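To make the unfilled-versus-filled-circles picture concrete, here is a hedged sketch of that refinement step: every word's initial representation is updated, in parallel, with an attention-weighted summary of the whole sentence. The residual-style addition and the dimensions are simplifications of my own, not the exact published architecture.

```python
# Refining all word representations in parallel with attention context
# (toy NumPy sketch; the residual addition is an illustrative simplification).
import numpy as np

rng = np.random.default_rng(1)
seq_len, d_model = 5, 16
embeddings = rng.normal(size=(seq_len, d_model))        # "unfilled circles"

scores = embeddings @ embeddings.T / np.sqrt(d_model)   # all word pairs at once
scores -= scores.max(axis=-1, keepdims=True)            # numerical stability
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
context = weights @ embeddings                           # attention-weighted summaries

refined = embeddings + context                           # "filled circles"
# The decoder would then attend over `refined` to generate the translation.
```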

Attention scores also make it possible to visualize the relationships the model finds between words. Take another coreference resolution example:

“The animal didn’t cross the street because it was too tired.”

“The animal didn’t cross the street because it was too wide.”

When translating these sentences into languages where “it” takes a gendered form depending on whether it refers to “animal” or “street,” encoding the sentence with the correct relationship is vital. Attention scores embed this information and can be visualized in the following manner:

Attention score machine translation
Image Credit: Google Blog

In the first sentence, the system picked up that “it” refers to the animal, whereas in the second, “it” refers to the street.
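For illustration only, the snippet below shows how attention weights can be read off for the word “it.” The weight values are hand-made to mimic the behavior described for the first sentence; they are not the output of a trained model.

```python
# Reading off what "it" attends to (hypothetical, hand-made weights).
import numpy as np

tokens = ["The", "animal", "didn't", "cross", "the", "street",
          "because", "it", "was", "too", "tired"]

# Hypothetical attention weights for the token "it", one per source token.
it_weights = np.array([0.02, 0.55, 0.02, 0.03, 0.02, 0.10,
                       0.03, 0.05, 0.03, 0.05, 0.10])

strongest = tokens[int(np.argmax(it_weights))]
print(f'"it" attends most strongly to: {strongest}')   # -> animal
```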

The implications of attention-based systems for machine translation are significant. The machine can now pick up on nuanced information and translate sentences more naturally. In DeepL's own blind testing, attention-based systems outperformed existing translators, producing fewer tense, intent, and reference errors:

DeepL Machine Translation
Image Credit: Google Blog

You can try out the Transformer using Google's tensor2tensor library, and you can use the DeepL translator online through their website.

Yitaek Hwang

From traveling the world solving vision issues in underserved regions through ViFlex to building software to diagnose autism using machine learning, I realized that I like building things. So currently I’m on a path to build an Internet of Things (IoT) platform at Leverege as a Venture for America Fellow.
