arvin-ash 1 jaar geleden

So How Does ChatGPT really work? Behind the screen!

ChatGPT is an intelligent chatbot that uses natural language processing. The GPT stands for Generative Pre-trained Transformer, which means it generates responses, it is pre-trained by humans, and it transforms input data into an output. This model was created by an artificial intelligence research company called OpenAI.

ChatGPT's power is the ability to interpret the context and meaning of a query and produce a relevant answer in grammatically correct and natural language, based on the information that it has been trained on.  


It uses neural networking, with supervised learning and reinforcement learning, two key components of modern machine learning.


What it does fundamentally is predict what words, phrases and sentences are likely to be associated with the input made. It then chooses the words and sentences that it deems most likely to be associated with the input.


So it attempts to understand your prompt and then output words and sentences that it predicts will best answer your question, based on the data it was trained on.


It also randomizes some outputs so that the answers you get for the same input, will often be different.


How ChatGPT fundamentally works, is that it tries to determine what words would most likely be expected after having learned how your input compares to words written on billions of webpages, books, and other data that it has been trained on. 


But it’s not like the predictive text on your phone that’s just guessing what the word will be based on the letters it sees.


ChatGPT attempts to create fully coherent sentences as a response to any input. And it doesn’t just stop at the sentence level. It’s generating sentences and even paragraphs that could follow your input. 


If you ask it complete this sentence, “Quantum mechanics is…” -- The processing that happens behind the scenes goes something like this: It calculates from all the instances of this text, what word comes next, and at what fraction of the time. It doesn’t look literally at text, but it looks for matches in context and meaning. 


The end result is that it produces a ranked list of words that might follow, together with their “probabilities.” So it’s calculations might produce something like this for the next word that would follow after the word “is”:


a 4.5%

based 3.8%

fundamentally 3.5%

described 3.2%

many 0.7%

It chooses the next word based on this tanking.


But the sentence completion model is not enough, because you might ask it to do something where that strategy might not be appropriate. 


In the first stage of the training process, Human contractors play the role of both a user and the ideal chatbot. Each training consists of a conversation with the goal of training the model to have human-like conversations.  


Through this supervised human-taught process, it learns to come up with an output that is more than just sentence completion. It learns patterns about the context and meaning of various inputs so that it can respond appropriately.


But human training has scale limitations. Human trainers could not possibly anticipate all the questions that could ever be asked. For this it uses a third step which is called reinforcement learning.


This is a type of unsupervised learning. This process trains the model where no specific output is associated with any given input. 


Instead the model is trained to learn the underlying context and patterns in the input data based on its earlier human-taught pretraining. 


This way the model can process a huge amount of data from various sources, and learn the patterns from texts and sentences of a near limitless number of subjects.


The dataset used to train ChatGPT which is based on GPT-3.5 is about 45 terabytes of data.


00:00 | What is ChatGPT?

01:33 | Magellan offer

02:31 | How ChatGPT differs from Google

04:26 | Overview of how ChatGPT works

07:07 | Simple example of what happens behind the scenes

09:45 | Beyond sentence completion 

10:21 | Three stages of pre-training process

13:24 | The huge dataset used

Arvin Ash
arvin-ash