Yandex launched simultaneous translation of live broadcasts in the browser

Yandex has added to its browser a technology for automatic voice-over translation of live broadcasts on YouTube. Announcements of long-awaited IT-industry innovations, interviews with celebrities and online broadcasts of space launches are now available in Russian in real time. While the technology is in open beta testing, it is used to translate broadcasts on a limited number of YouTube channels. This was reported to CNews by representatives of Yandex.

There is no longer any need to wait for a translated recording to appear after an event ends: live broadcasts can be watched in Russian as they happen, including webinars, business conferences and presentations by large technology companies, such as Apple’s autumn event. Automatic voice-over translation of live broadcasts complements the existing voice-over translation of recorded videos and interactive subtitles.

“We continue to improve technologies that help erase language barriers on the Internet. In the past 10 months, for example, Yandex Browser users have watched 81 million videos with translated voice-over. Then we took on subtitles, and now it is time to translate live broadcasts. The next step is translating streams and videos not only on YouTube but also, for example, on Twitch. We have taught our neural networks to translate broadcasts in English, German, French, Italian and Spanish; next we will add new pairs of European languages, as well as Chinese, Japanese and others,” said Dmitry Timko, head of the Yandex application and Yandex Browser.

Voice-over translation of streaming video is an extremely difficult engineering task. On the one hand, context is crucial for high-quality translation of foreign speech, since the same word can mean different things in different situations, so it is desirable to feed the neural network as much text as possible at a time. On the other hand, a streaming scenario demands minimal delay, which means translation must happen almost instantly: there is no time to wait until the speaker finishes formulating a complete thought. The neural networks therefore act like a simultaneous interpreter who starts translating a sentence before it is finished.
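The context-versus-latency trade-off described above can be sketched in a few lines. This is an illustration only, not Yandex’s implementation: the idea is to buffer recognized words and flush them to the translator either at a sentence boundary (full context) or once a latency budget is exhausted (partial context), so translation never waits for the end of the speech.

```python
def simultaneous_chunks(words, max_words=8):
    """Yield translation-ready chunks from a stream of recognized words.

    A chunk is emitted as soon as sentence-final punctuation appears, or
    when the buffer reaches `max_words` (the latency budget), whichever
    comes first.
    """
    buffer = []
    for word in words:
        buffer.append(word)
        if word.endswith((".", "!", "?")) or len(buffer) >= max_words:
            yield " ".join(buffer)
            buffer = []
    if buffer:  # flush whatever remains when the stream ends
        yield " ".join(buffer)


stream = "Liftoff is confirmed . The rocket is climbing steadily through the clouds now"
chunks = list(simultaneous_chunks(stream.split(), max_words=5))
# First chunk ends early at the period; the next is cut by the word budget.
```

Raising `max_words` gives the translator more context per chunk at the cost of higher delay; lowering it does the opposite, which is exactly the tension the article describes.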

Fast, high-quality work in a streaming scenario required rebuilding the entire architecture of voice-over video translation. With recorded videos, the neural network receives the whole audio track and therefore has the full context, which makes the task easier. Translating a live broadcast works completely differently: one neural network recognizes the audio and turns it into text literally on the fly, while another determines the speaker’s gender from voice biometrics. The hardest part comes next: a third neural network places punctuation marks and extracts semantic fragments from the text, parts that contain a complete thought. These fragments are passed to yet another neural network responsible for translation, and the result is immediately synthesized as Russian speech.
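The multi-stage pipeline described above can be sketched as follows. All stage names and stub functions here are hypothetical stand-ins for the real neural networks; only the order of stages (recognition, gender detection, punctuation and fragment extraction, translation, synthesis) comes from the article.

```python
from typing import Iterator


def recognize(audio_chunk: bytes) -> str:
    """Stage 1: speech recognition, audio to raw text (stub)."""
    return audio_chunk.decode("utf-8")


def detect_gender(audio_chunk: bytes) -> str:
    """Stage 2: speaker gender from voice biometrics (stub)."""
    return "male"


def extract_fragments(raw_text: str) -> Iterator[str]:
    """Stage 3: punctuate and cut out complete-thought fragments (stub).

    A real model decides where a thought ends; here we naively split on
    sentence-final periods for illustration.
    """
    for part in raw_text.split(". "):
        if part:
            yield part.rstrip(".") + "."


def translate(fragment: str) -> str:
    """Stage 4: machine translation of one fragment (stub)."""
    return f"[RU] {fragment}"


def synthesize(text: str, gender: str) -> bytes:
    """Stage 5: speech synthesis in a voice matching the speaker (stub)."""
    return f"<{gender} voice> {text}".encode("utf-8")


def pipeline(audio_chunks) -> Iterator[bytes]:
    """Chain the stages so each audio chunk is processed on the fly."""
    for chunk in audio_chunks:
        raw = recognize(chunk)
        gender = detect_gender(chunk)
        for fragment in extract_fragments(raw):
            yield synthesize(translate(fragment), gender)


out = list(pipeline([b"Hello world. This is a stream."]))
```

The key design point the article makes is that, unlike translating a finished recording, no stage may wait for the whole audio track: each one consumes and emits data incrementally.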

Any user of Yandex Browser on a computer can try the technology: voice-over translation of streams is currently available for a specific list of YouTube channels that broadcast live.
