How are LLMs Trained?
LLMs may sound complicated, but did you know they are not that "difficult" to create? In fact, our everyday life is a bit like an LLM of its own. We have friends, stories, books, food, and other things in our lives that we reference each day.
Training an LLM
- Learning the Language: The first step in training an LLM is to feed the robot whatever we want to train it on. We begin by having the robot read lots of books, websites, and other texts to learn how people talk and write. By doing this, it learns words, sentences, the meanings behind them, how they are used, and how people think.
- Big Brain (Neural Network): Each robot being trained has what is called a neural network, its "big brain." Think of it like a giant brain made up of lots of tiny connected parts. Just as in humans, this "big brain" is what helps the LLM (robot) remember what it has been trained on, so when we ask for that information later, it simply gives us the answer.
- Understanding Context: When you ask the robot a question, it uses its big brain to think about what you said. It looks at each word and tries to understand the whole sentence, just like you do when you read a story.
- Finding Answers: After understanding your question, the robot searches through all the information it learned. It looks for the best answer or the most likely response based on what it knows.
- Talking Back: Finally, the robot uses its language skills to put together a response. It tries to make sure the answer makes sense and sounds like something a real person would say.
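The steps above can be sketched in a few lines of Python. This is a toy "bigram" model, far simpler than a real LLM, but it shows the same idea: read some text, remember which word tends to follow which, then use that memory to talk back. All the names and the little story here are made up for illustration.

```python
import random
from collections import defaultdict

def learn(text):
    """Learning the language: remember which word follows which."""
    follows = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current].append(nxt)  # the "big brain": stored word pairs
    return follows

def talk_back(follows, start, length=5):
    """Talking back: build a reply one likely next word at a time."""
    word = start
    reply = [word]
    for _ in range(length):
        options = follows.get(word)
        if not options:               # nothing was ever seen after this word
            break
        word = random.choice(options) # pick one of the likely next words
        reply.append(word)
    return " ".join(reply)

story = "the cat sat on the mat and the cat slept"
model = learn(story)
print(talk_back(model, "the"))
```

Running this prints a short, story-like phrase starting with "the". A real LLM replaces the simple word-pair table with a neural network holding billions of parameters, but the learn-then-respond loop is the same shape.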
We have also trained our brains to remember and associate these things based on their relevance and importance to us. The main difference between this analogy and a computer-based LLM is that the computer one has billions or even trillions of pieces of information stored. The "Large" in LLM (Large Language Model) comes from the size of the information, called "parameters," that these models contain. Generally, the more parameters a model holds, the "smarter" it becomes.
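To make "parameters" concrete, here is a quick count for a made-up tiny network (the layer sizes are invented for illustration). Every connection between two layers is one parameter, plus one "bias" number per unit; real LLMs apply the same counting to vastly bigger layers, which is where the billions come from.

```python
# Hypothetical layer sizes for a tiny network; real LLMs use far larger ones.
layers = [8, 16, 4]  # 8 inputs -> 16 hidden units -> 4 outputs

# Parameters = every connection (a * b) plus one bias per receiving unit (b).
params = sum(a * b + b for a, b in zip(layers, layers[1:]))
print(params)  # 8*16+16 + 16*4+4 = 212
```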
Just as with humans, to train an LLM, scientists begin by feeding the model huge amounts of information. This information comes from books, websites, articles, pictures, videos, and so on. To make sure the model remembers, it is shown the same information over and over again. Using a neural network, the model stores what it learns, just as we humans remember faces we have seen before.
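The "over and over again" part can be sketched with the simplest possible model: one parameter learning one made-up rule (here, that the answer is always 3 times the input). Each pass over the examples nudges the parameter a little closer; repetition is what makes the lesson stick. This is a bare-bones sketch of gradient descent, not how any particular LLM is actually coded.

```python
# Made-up (input, correct answer) pairs following the rule y = 3 * x.
data = [(1, 3), (2, 6), (3, 9)]

w = 0.0    # the model's single "parameter", starting untrained
lr = 0.01  # learning rate: how big each correction step is

for epoch in range(200):      # show the same examples again and again
    for x, y in data:
        guess = w * x         # the model's current answer
        error = guess - y     # how wrong it was
        w -= lr * error * x   # nudge the parameter to reduce the error

print(round(w, 2))  # close to 3.0: the model has "remembered" the rule
```

Real training works the same way in spirit, just with billions of parameters being nudged at once across enormous datasets.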
So, to simplify and conceptualize an LLM, you can view it as a "tape recorder" that plays back a song that has been pre-recorded.