Revenue Stream Development Specialist

Explore and discuss data innovations to drive business efficiency forward.
rifat28dddd
Posts: 706
Joined: Fri Dec 27, 2024 12:30 pm

Post by rifat28dddd »

For example, the following languages have large numbers of speakers but contribute less than 10% of the text content on the Internet, which makes it difficult to collect enough data to train a large language model specifically for any one of them:

Hindi: 100 million speakers
Arabic: 100 million speakers
Bengali: 100 million speakers
Urdu: 100 million speakers

This gap between the number of speakers and the amount of available text leads to an imbalance in language diversity. The root of the problem lies more in each country's level of development and investment, which we will elaborate on in the next blog post. It is also a fundamental challenge for large language models that aim to support a wider range of languages.


If a language has only a small amount of text on the Internet, there is no large language model suited to it. Even if a language has a large amount of text on the Internet, the country that represents it still needs to increase investment to develop a large language model that captures the language's own characteristics. So I categorized the world's languages by their level of support: high-resource and low-resource languages.

English is the most effective "programming language" for large language models

Large language models have an input and output limit expressed in units of k (tokens). If k is too small, such as only [...], what can be done is very limited. This is a bit like early personal computers that had only [...]K of memory and could not run "large programs". Nowadays, some smartphones have far more memory than that. How many English words or Chinese characters make up one k will be explained later.

The k length of language models has been growing. As of [...], up to [...]K k are supported. Here K stands for thousands: 1K k is one thousand k. How to write prompts for large language models elegantly and economically has become a craft.

As of [...], the model and its context length limit

Giving instructions to large language models is a bit like typing instructions into early computers.
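The k bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `estimate_k` is a hypothetical stand-in that counts whitespace-separated chunks, which is not how a real model counts k (it would badly undercount Chinese text, for instance), and the limit in the usage lines is a made-up number, not the limit of any particular model.

```python
K = 1000  # "K stands for thousands", as in the post


def estimate_k(text: str) -> int:
    """Rough, hypothetical estimate of prompt size.

    Counts whitespace-separated chunks. A real model uses its own
    tokenizer, so treat this number as a placeholder only.
    """
    return len(text.split())


def fits_in_context(prompt: str, limit_k: int, reserved_for_output: int = 0) -> bool:
    """Return True if the estimated prompt size plus the space reserved
    for the model's answer stays within the context limit."""
    if limit_k <= 0:
        raise ValueError("limit_k must be positive")
    if reserved_for_output < 0:
        raise ValueError("reserved_for_output must not be negative")
    return estimate_k(prompt) + reserved_for_output <= limit_k


if __name__ == "__main__":
    prompt = "Summarize the following report in three bullet points."
    hypothetical_limit = 4 * K  # made-up limit, for illustration only
    print(estimate_k(prompt))
    print(fits_in_context(prompt, hypothetical_limit, reserved_for_output=500))
```

Reserving part of the limit for the output reflects the point that the limit covers both input and output, so a prompt that uses up the whole limit leaves no room for an answer.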