Bringing down the language barrier

By Mike Schroepfer
October 23, 2020

For billions of people around the world, language remains a major barrier to getting the most out of the internet. All the stuff that people like me take for granted when we go online — the ability to learn new things, or be entertained, or participate in global communities — simply doesn’t work very well for those who don’t speak one of the handful of languages that dominate the internet.

Universal translation, from any language to any other, would change all of that. It’s hard to imagine a technological development that would be more transformative for more people, immediately breaking down barriers and creating opportunities on an enormous scale. It’s no coincidence that such technology is a core part of so much great sci-fi, from Star Trek to The Hitchhiker’s Guide to the Galaxy. And while we’re not there yet, I think people underestimate just how much progress has been made in the last 8 years. 

In fact, in the decades I’ve been working in technology, there are only a few examples I can think of where so much progress has happened in such a short amount of time. There’s a lot about AI that gets overhyped, especially when people assume that progress in one area easily transfers to another. But in areas like machine translation and multilingual understanding, the progress has been truly stunning, with so much more to come.

Our AI team released an incredible new step in this direction just this week: an AI model that has been trained to work with 100 languages, and can translate directly between any pair of them. Typical translation systems are centered on English: they often translate first into English and then from English into the target language, or are trained to learn languages based on their relationship to English. By removing that middle step, we’re not just making the whole system much more efficient; we’re also making it significantly more accurate, reducing the game-of-telephone errors that emerge when you’re working with a translation of a translation.
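
To make the game-of-telephone problem concrete, here is a minimal toy sketch in Python. It is not how the model itself works: the tiny word tables and the function names are hypothetical, invented purely for illustration. It relies on one well-known linguistic fact: Spanish and French both distinguish informal from formal "you," while English does not, so a detour through English can lose that distinction.

```python
# Toy lexicons: hypothetical data for illustration only, not taken from any real system.
ES_TO_EN = {"tú": "you", "usted": "you"}   # English collapses informal/formal "you"
EN_TO_FR = {"you": "vous"}                 # the pivot has to pick one default form
ES_TO_FR = {"tú": "tu", "usted": "vous"}   # a direct mapping keeps the distinction


def pivot_translate(word: str) -> str:
    """Spanish -> English -> French: a translation of a translation."""
    return EN_TO_FR[ES_TO_EN[word]]


def direct_translate(word: str) -> str:
    """Spanish -> French with no detour via English."""
    return ES_TO_FR[word]


def count_directions(n_languages: int) -> int:
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)
```

In this toy setup, the informal "tú" comes out of the pivot route as the formal "vous," while the direct route preserves it as "tu." The last helper is simple arithmetic: 100 languages give 100 × 99 = 9,900 possible translation directions.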

And we’re putting languages that have historically been overlooked by machine translation research at the center of our systems. If a Bengali speaker in Bangladesh wants to communicate with a Tamil speaker in Sri Lanka, this new system, named M2M-100, understands both their languages and translates directly, with no detour via English. And the whole thing has been made available as an open source project, free for all to build on and experiment with. 

This is just one of many new translation capabilities that AI work at Facebook and elsewhere is enabling. Some of them are already operational, improving the accuracy of the 20 billion translations we already do every single day. Some are advances that are yet to be deployed, but they soon will be. And others are part of a long and extremely promising R&D pipeline that will keep producing major advances for years to come.

These advances will be great for people like me, letting us experience the richness of cultures across the world in ways previous generations could never have imagined. But they will have a much, much bigger impact elsewhere: universal translation will be truly life-changing for people currently held back from the online world by language barriers. Those barriers are going to fall, and witnessing the creation of the technology that will bring them down is incredibly exciting.