Yandex publishes YaLM 100B – now the largest publicly available GPT-like neural network

Yandex has publicly released YaLM 100B, a neural network for generating and processing text in Russian and English. It is the largest GPT-like model released publicly to date, and developers and researchers around the world can now use it. Yandex representatives reported this to CNews.

YaLM 100B contains 100 billion parameters, more than any existing model for the Russian language. This makes it suitable for a wide range of natural language processing tasks. Language models from the YaLM family learn the principles by which text is constructed and generate new text based on the rules of language and their knowledge of the world. For example, they can come up with ideas for advertising campaigns and write descriptions for products and videos. They can also generate texts of any kind (poems, answers, congratulations, and so on) and classify them, for example by style of speech.

The Yandex team uses YaLM neural networks in more than 20 projects, including Search and the Alice voice assistant. The language models help support staff respond to requests and generate advertisements and site descriptions (snippets). YaLM neural networks are also widely used to prepare quick answers in Search.

“Training such a large language model requires huge resources, experienced specialists and years of work. It is important to us that not only the largest IT companies but the entire community of researchers and developers has access to modern technologies. By making YaLM 100B publicly available, we expect to give impetus to the development of generative neural networks,” said Petr Popov, CEO of Yandex Technologies.

The model was trained on Yandex supercomputers, which have been recognized as the most powerful in Eastern Europe. During training, YaLM 100B processed about 2 TB of text in English and Russian from open datasets and the Internet. The model is released under the open Apache 2.0 license and is available on GitHub.
