How does ChatGPT work?
We do a deep dive into the inner workings of the popular AI chatbot, ChatGPT. If you want to know how its generative AI magic happens, read on.
Google, Wolfram Alpha, and ChatGPT all interact with users through a single-line text input field and provide text results. Google returns search results: a collection of web pages and articles that (ideally) answer the search query. Wolfram Alpha often provides mathematical and data analysis-related answers.
ChatGPT, on the other hand, provides an answer based on the context and intent behind the user’s question. For instance, ChatGPT can compose stories and code modules, although you cannot ask Wolfram Alpha or Google to do so.
Fundamentally, the power of Google is its ability to do massive database searches and provide a series of matches. The strength of Wolfram Alpha lies in its capacity to parse data-related queries and perform computations in response to them. The strength of ChatGPT is its capacity to parse inquiries and generate comprehensive answers and results based on the vast majority of text-based data that is digitally available around the globe – or at least the information that existed when it was trained, before 2021.
In this post, we’ll examine ChatGPT’s capability to provide these thorough answers. We’ll start by looking at the key phases of ChatGPT’s operation, and then we’ll cover some of the key components of the AI architecture that make it all work.
In addition to the materials cited in this post, many of which are the original research articles behind each of the technologies, I used ChatGPT itself to help me create this backgrounder. I asked it many questions. Some of its answers are paraphrased within the general context of this discussion.
The two main phases of the ChatGPT operation
Let’s use Google as an analogy again. When you ask Google to look something up, you probably know that it doesn’t – the moment you ask – go out and scour the entire web for answers. Instead, Google looks through its database for pages that match the request. Google’s two primary phases are therefore the crawling and data-gathering phase and the user interaction/lookup phase.
Roughly speaking, ChatGPT works the same way. The data-gathering phase is called pre-training, and the user-responsiveness phase is called inference. The key to generative AI’s success, and the reason it has suddenly become so popular, is that this pre-training procedure has proven to be surprisingly scalable.
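To make the two-phase idea concrete, here is a minimal sketch of "gather first, look up later". It is not how Google actually works; the page URLs, the PAGES dictionary, and the function names build_index and lookup are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical mini "web": page URL -> page text (invented for illustration).
PAGES = {
    "example.com/a": "how to reset a password",
    "example.com/b": "password managers compared",
    "example.com/c": "how transformers process language",
}

def build_index(pages):
    """Phase 1 (ahead of time): map each word to the pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, query):
    """Phase 2 (at query time): intersect the page sets of the query words."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

INDEX = build_index(PAGES)  # done once, before any user shows up
print(sorted(lookup(INDEX, "password")))
print(sorted(lookup(INDEX, "how password")))
```

The expensive work happens in build_index; answering a query only touches the finished index, which is why the lookup can be fast.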
Generally speaking (because going into detail would take too long), AIs pre-train using two main approaches: supervised and unsupervised. The supervised technique has been employed for the majority of AI projects up until the current generation of generative AI systems like ChatGPT.
In supervised pre-training, a model is trained on a labeled dataset, where each input is paired with a corresponding output.
A dataset of customer service dialogues, for instance, may be used to train an AI. In this dataset, user queries and complaints are labeled with the relevant customer care representative replies.
Questions like “How do I reset my password?” would be provided as user input, and responses such as “You can reset your password by visiting the account settings page on our website and following the instructions” would be provided as output.
In a supervised training approach, the overall model is trained to learn a mapping function that can accurately map inputs to outputs. Supervised learning applications including classification, regression, and sequence labeling frequently employ this method.
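As a rough illustration of learning from labeled input/output pairs, the sketch below answers a new question by finding the labeled question it overlaps with most. This is a deliberately simple nearest-match approach, not what a production system would use, and the dataset and the names LABELED_PAIRS, tokens, and predict are hypothetical.

```python
import re

# Hypothetical labeled dataset: each input is paired with the desired output.
LABELED_PAIRS = [
    ("How do I reset my password?",
     "You can reset your password from the account settings page."),
    ("Where can I see my invoices?",
     "Invoices are listed under Billing in your account."),
    ("How do I close my account?",
     "You can close your account from the account settings page."),
]

def tokens(text):
    """Lowercase the text and return its words as a set, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))

def predict(question, pairs=LABELED_PAIRS):
    """Return the labeled output whose input shares the most words with the question."""
    asked = tokens(question)
    _, best_output = max(pairs, key=lambda pair: len(asked & tokens(pair[0])))
    return best_output

print(predict("I forgot my password, how do I reset it?"))
```

The sketch also shows the scaling problem: a question that resembles none of the labeled inputs still gets mapped to one of the three prepared answers.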
As you may expect, there are restrictions on how this can scale. Human trainers would have to go to great lengths to anticipate all inputs and outputs. Training can take a long time and be limited in subject matter expertise.
But as we have found out, ChatGPT has very few limits in terms of subject matter expertise. You can ask it to write a resume for Chief Miles O’Brien from Star Trek, explain quantum physics, write code, write short fiction, and compare the governing styles of former presidents of the United States.
There is virtually no way ChatGPT could have been trained with a supervised model because it would be impossible to foresee every question that would be asked. Instead, ChatGPT uses unsupervised pre-training – and that changes the game.
Unsupervised pre-training is the process by which a model is trained on data where no specific output is associated with each input. Instead, the model is trained to discover the underlying structure and patterns in the input data without any specific task in mind.
Unsupervised learning tasks such as clustering, anomaly detection, and dimensionality reduction frequently use this method. In language modeling, unsupervised pre-training can teach a model the syntax and semantics of natural language so that it can produce coherent and meaningful text in a conversational context.
This is where ChatGPT’s seemingly limitless knowledge becomes possible. Because the developers don’t need to know the outputs that result from the inputs, they only need to keep adding data to ChatGPT’s pre-training process. That process is called transformer-based language modeling.
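A toy way to see how plain text can train a language model without human-written labels is a word-pair counter: the "right answer" for each word is simply the word that follows it in the text. This is a far simpler technique than anything inside ChatGPT, and the corpus and the names train_bigrams and predict_next are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical unlabeled corpus: plain text, no human-written answers attached.
CORPUS = (
    "the cat sat on the mat . "
    "the cat chased the mouse . "
    "the cat sat on the sofa ."
)

def train_bigrams(text):
    """Count which word follows which; the text itself supplies the targets."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

MODEL = train_bigrams(CORPUS)
print(predict_next(MODEL, "cat"))
print(predict_next(MODEL, "sat"))
```

Adding more text to CORPUS improves the counts without anyone having to write a single label, which is the property that lets this style of training scale.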
The transformer architecture is a type of neural network used to process natural language data. A neural network loosely simulates the way a human brain works, processing information through layers of interconnected nodes. Think of a neural network as a hockey team: each player has a specific role, but they pass the puck back and forth, all working together to score the goal.
When formulating predictions, the transformer architecture analyses word sequences using “self-attention” to assess the significance of individual words. Self-attention is similar to the way a reader might look at a previous sentence or paragraph for the context needed to understand a new word in a book. To comprehend the context and links between words, the transformer examines each word in a sequence.
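The weighting idea behind self-attention can be sketched in a few lines of plain Python. This is a simplified version: real transformers first pass each word vector through learned query, key, and value matrices, which are left out here, and the two-dimensional EMBEDDINGS are made-up numbers.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Each word's output is a weighted average of every word in the sequence,
    with larger weights going to the words it is most similar to.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional embeddings for a three-word sequence.
EMBEDDINGS = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
OUTPUTS, WEIGHTS = self_attention(EMBEDDINGS)
for row in WEIGHTS:
    print([round(w, 2) for w in row])
```

Each printed row shows how much one word "looks at" every word in the sequence, including itself. The first two vectors are similar, so they give each other more weight than they give the third.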
The transformer is made up of several layers, each with multiple sublayers. The two primary sublayers are the self-attention layer and the feedforward layer. The self-attention layer determines the significance of each word in the sequence, while the feedforward layer applies a non-linear transformation to the input data. These layers are what allow the transformer to learn and understand the relationships between the words in a sequence.
During training, the transformer receives input such as a sentence, and is asked to make a prediction based on that input. The model updates based on how well its prediction matches the actual output. Through this process, the transformer learns to understand the context and relationships between the words in a string, making it a powerful tool for natural language processing tasks such as language translation and text generation.
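The predict, compare, and update loop can be shown with a model much smaller than a transformer: one row of scores per word, adjusted by gradient descent so that the actual next word becomes more likely. The vocabulary, the training sequence, and the hyperparameters below are assumptions chosen only to keep the example tiny.

```python
import math

VOCAB = ["the", "cat", "sat", "down"]
# Hypothetical training sequence; each word's target is simply the next word.
SEQUENCE = ["the", "cat", "sat", "down"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(sequence, vocab, epochs=200, learning_rate=0.5):
    """Fit one row of scores (logits) per current word by gradient descent."""
    index = {word: i for i, word in enumerate(vocab)}
    logits = [[0.0] * len(vocab) for _ in vocab]  # start with no preference
    losses = []
    for _ in range(epochs):
        total_loss = 0.0
        for current, target in zip(sequence, sequence[1:]):
            row = logits[index[current]]
            probs = softmax(row)  # the model's prediction
            total_loss += -math.log(probs[index[target]])  # cross-entropy loss
            for j in range(len(vocab)):
                # Gradient of the loss for logit j: probability minus one-hot target.
                grad = probs[j] - (1.0 if j == index[target] else 0.0)
                row[j] -= learning_rate * grad  # move toward the actual next word
        losses.append(total_loss)
    return logits, losses

def predict_next(logits, vocab, word):
    """Return the word with the highest score after `word`."""
    row = logits[vocab.index(word)]
    return vocab[row.index(max(row))]

LOGITS, LOSSES = train(SEQUENCE, VOCAB)
print(round(LOSSES[0], 3), "->", round(LOSSES[-1], 3))
print(predict_next(LOGITS, VOCAB, "cat"))
```

The loss falls as the predictions get closer to the actual next words. A real transformer follows the same loop, but with billions of parameters and attention layers in place of this lookup table.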
Before we look into the user interaction stage of ChatGPT and natural language, let’s talk about the data that is provided to it.
ChatGPT training datasets
The dataset used to train ChatGPT is huge. ChatGPT is based on the GPT-3 (Generative Pre-trained Transformer 3) architecture. Now, the abbreviation GPT makes sense, doesn’t it? It’s generative, which means it generates results; it’s pre-trained, which means it builds on all the data it ingests; and it uses a transformer architecture that weighs text inputs to understand the context.
GPT-3 was trained on a mix of text sources: a filtered version of the Common Crawl web archive, the WebText2 dataset, two book collections, and English Wikipedia. The raw Common Crawl data alone amounted to about 45 terabytes of compressed text before filtering. When you can buy a 16-terabyte hard drive for less than $300, a 45-terabyte corpus might not seem that big. But text takes up much less storage space than photos or videos.
This massive amount of data has allowed ChatGPT to learn patterns and relationships between natural language words and phrases on an unprecedented scale, which is one of the reasons it is so good at producing coherent, contextually relevant answers to user questions.
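To get a feel for how much text fits in 45 terabytes, here is a back-of-the-envelope calculation. Every constant in it is an assumption of mine (bytes per character, characters per word, words per book, photo size), so treat the results as orders of magnitude rather than measurements.

```python
# Back-of-the-envelope estimate; every constant below is an assumption.
CORPUS_BYTES = 45 * 10**12   # 45 terabytes, using decimal (SI) terabytes
BYTES_PER_CHAR = 1           # assumes mostly plain English text, uncompressed
CHARS_PER_WORD = 6           # assumes about 5 letters plus a space per word
WORDS_PER_NOVEL = 100_000    # assumes a fairly long novel
PHOTO_BYTES = 3 * 10**6      # assumes a 3 MB photo

words = CORPUS_BYTES // (BYTES_PER_CHAR * CHARS_PER_WORD)
novels = words // WORDS_PER_NOVEL
photos = CORPUS_BYTES // PHOTO_BYTES

print(f"{words:,} words")
print(f"{novels:,} novel-length books")
print(f"{photos:,} photos in the same space")
```

Under these assumptions, the space that holds tens of millions of novel-length books holds only about 15 million photos.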
Although ChatGPT is based on the GPT-3 architecture, it has been fine-tuned on a different dataset and optimized for conversational use cases. This allows it to provide a more personalized and engaging experience for users who interact with it through a chat interface.
For example, Persona-Chat, a dataset released by researchers at Facebook AI Research, is designed specifically for training conversational AI models. It contains more than 160,000 utterances from conversations between pairs of people, each of whom was given a distinct persona summarizing their background, interests, and personality. Training on data like this lets a model such as ChatGPT learn to produce replies that are tailored to the individual and the conversational setting.
Numerous other conversational datasets were reportedly used to fine-tune ChatGPT in addition to Persona-Chat. Here are a few examples:
Cornell Movie Dialogs Corpus: A dataset containing conversations between characters in movie scripts. It includes over 200,000 conversational exchanges between over 10,000 movie character pairs, covering a wide range of topics and genres.
Ubuntu Dialogue Corpus: A collection of dialogs between users seeking technical support and the Ubuntu community support team. It contains over 1 million dialogs, making it one of the largest publicly available datasets for research on dialog systems.
DailyDialog: A collection of human-to-human dialogues on a variety of topics, ranging from everyday conversations to discussions of social issues. Each dialogue in the dataset has several turns and is labeled with details such as topic and emotion.
In addition to these datasets, a sizable quantity of unstructured data from the Internet, such as webpages, books, and other text sources, was used to train ChatGPT.
This allowed ChatGPT to learn about language structure and patterns in a more general sense, which could then be tuned for specific applications such as dialog management or sentiment analysis.
ChatGPT is a distinct model that was developed using a methodology similar to the one used for the GPT series, with minor modifications in architecture and training data. ChatGPT is sometimes described as having 1.5 billion parameters, far fewer than GPT-3’s 175 billion, but that figure matches GPT-2, and OpenAI has not published an official parameter count for ChatGPT.
Generally speaking, the training data used to fine-tune ChatGPT is conversational and deliberately chosen to include human-to-human conversations, allowing ChatGPT to learn how to produce natural and engaging replies in conversation.
Consider the unsupervised training of ChatGPT as having been given a large amount of data and left to discover patterns and make sense of it all. This approach has made it possible for new generative AI systems to expand so swiftly.
Pre-training does the bulk of the work for ChatGPT’s generative AI, but it also has to be able to comprehend the queries and create replies using all of this information.
This is done by the inference phase, which consists of natural language processing and dialog management.
What about human involvement in pre-training?
Despite the inherent scalability of unsupervised pre-training, there is some indication that human intervention may have been necessary to get ChatGPT ready for use by the general public.
The big reveal was in an article in TIME magazine that discussed the use of human “data labelers” who earn between $1.32 and $2 an hour in Kenya. According to the TIME report, it was the responsibility of these workers to scan the internet’s gruesome and sexually explicit content to flag it for ChatGPT training.
Another source, an article in the AI newsletter MarkTechPost, says that the large language model was trained using a method known as Reinforcement Learning from Human Feedback (RLHF).
According to the article, “The training process involved an initial model tuned using supervised learning, with human trainers playing the role of user and AI assistant.”
There are some subtleties here when it comes to the meaning of the word “training”. According to ChatGPT itself, “OpenAI did not use reinforcement learning with human feedback to train me. Instead, I was pre-trained using a combination of unsupervised and supervised learning techniques such as language modeling, auto-encoding, and sequence prediction. My training involved processing large amounts of text data from the internet, which allowed me to learn patterns and relationships between words and sentences.”
The AI continued, “However, once I have been pre-trained, researchers and developers can use reinforcement learning with human feedback to tune me for specific tasks or domains, such as answering questions or generating text. In those cases, humans can provide feedback in the form of rewards or penalties, which can be used to update my parameters and improve my performance on similar tasks in the future.”
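The "rewards or penalties update the parameters" idea can be sketched with a toy preference table. This is not the RLHF pipeline itself, which is considerably more involved. It is only a reward-driven score update, and the candidate replies, the ratings, and the name apply_feedback are invented for illustration.

```python
import math

# Hypothetical candidate replies and hypothetical human ratings (+1 good, -1 bad).
CANDIDATES = ["helpful answer", "rude answer", "off-topic answer"]
HUMAN_REWARD = {"helpful answer": 1.0, "rude answer": -1.0, "off-topic answer": -1.0}

def softmax(scores):
    """Turn raw preference scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def apply_feedback(preferences, rounds=20, learning_rate=0.3):
    """Nudge each reply's score up or down according to the human reward."""
    prefs = list(preferences)
    for _ in range(rounds):
        for i, reply in enumerate(CANDIDATES):
            prefs[i] += learning_rate * HUMAN_REWARD[reply]
    return prefs

START = [0.0, 0.0, 0.0]  # the model initially has no preference
TUNED = apply_feedback(START)
BEFORE, AFTER = softmax(START), softmax(TUNED)
print([round(p, 3) for p in BEFORE])
print([round(p, 3) for p in AFTER])
```

Before feedback, all three replies are equally likely. After repeated rewards and penalties, nearly all of the probability sits on the reply the human raters preferred.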
This seems to fit with the MarkTechPost and TIME reports: the initial pre-training was unsupervised, which allowed an enormous amount of data to be fed into the system. But in building the dialog responses that communicate with users (more on that below), the response engines were apparently trained both on the kinds of responses to give and to filter out inappropriate material – and that training appears to have been assisted by humans.
Natural language processing
Natural language processing (NLP) focuses on enabling computers to understand, interpret, and generate human language. With the exponential growth of digital data and the increasing use of natural language interfaces, NLP has become a crucial technology for many businesses.
NLP technologies can be used for a wide range of applications including sentiment analysis, chatbots, speech recognition, and translation. By leveraging NLP, businesses can automate tasks, improve customer service and gain valuable insights from customer comments and social media posts.
One of the main challenges in implementing NLP is dealing with the complexity and ambiguity of human language. NLP algorithms need to be trained on large amounts of data to recognize patterns and learn the nuances of the language. They also need to be continually refined and updated to keep up with changes in language context and usage.
The technology works by breaking language inputs, such as sentences or paragraphs, into smaller components and analyzing their meanings and relationships to generate insights or answers. NLP technologies use a combination of techniques, including statistical modeling, machine learning, and deep learning, to recognize patterns and learn from large amounts of data to accurately interpret and generate language.
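The "breaking language into smaller components" step can be illustrated with a simple word-level tokenizer. Production language models generally split text into subword pieces rather than whole words, so this is only a sketch of the idea, and the function names tokenize and bigrams are my own.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase word tokens and standalone punctuation marks."""
    return re.findall(r"[a-z0-9']+|[.,!?;]", text.lower())

def bigrams(tokens):
    """Pair each token with its successor to expose simple word relationships."""
    return list(zip(tokens, tokens[1:]))

SENTENCE = "How do I reset my password? I forgot my password."
TOKENS = tokenize(SENTENCE)
print(TOKENS)
print(Counter(TOKENS)["password"])
print(bigrams(TOKENS)[:3])
```

Once text is in this form, statistical and machine learning techniques can count, compare, and model the pieces instead of working on raw strings.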
Dialogue management
You may have noticed that ChatGPT can ask follow-up questions to clarify your intent or better understand your needs, and that it gives personalized responses that take the entire history of the conversation into account.
This is called dialogue management, and it is how ChatGPT can hold multi-turn conversations with users in a natural and engaging way. It involves using algorithms and machine learning techniques to understand the context of a conversation and sustain it across multiple exchanges with the user.
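One simple way to keep context across turns is to store the conversation history and hand it back to the model with every new message. The sketch below assumes that approach. The Conversation class, its max_turns cap, and the "user:" / "assistant:" line format are all hypothetical and say nothing about how ChatGPT stores conversations internally.

```python
class Conversation:
    """Keeps the running history so each new reply can see earlier turns."""

    def __init__(self, max_turns=6):
        self.max_turns = max_turns  # hypothetical cap on remembered turns
        self.history = []           # list of (speaker, text) tuples

    def add(self, speaker, text):
        """Record one turn, dropping the oldest turns once the cap is exceeded."""
        self.history.append((speaker, text))
        self.history = self.history[-self.max_turns:]

    def build_context(self, new_message):
        """Combine remembered turns and the new message into one model input."""
        lines = [f"{speaker}: {text}" for speaker, text in self.history]
        lines.append(f"user: {new_message}")
        lines.append("assistant:")
        return "\n".join(lines)

chat = Conversation(max_turns=4)
chat.add("user", "Recommend a sci-fi show.")
chat.add("assistant", "You might enjoy Star Trek: Deep Space Nine.")
CONTEXT = chat.build_context("Who is the chief engineer in it?")
print(CONTEXT)
```

Because the earlier turns travel along with the new question, a model reading CONTEXT has what it needs to work out what "it" refers to.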
Dialogue management is an important aspect of natural language processing because it allows computer programs to interact with people in a way that feels more like a conversation than a series of one-off interactions. This can help build trust and engagement with users and ultimately lead to better outcomes for the user and the organization using the program.
Marketers, of course, want to expand on how trust is built, but this is also an area that can be scary because it’s a way for an AI to be able to manipulate the people who use it.
Conclusion on How ChatGPT Works
Although we’re pushing 2,500 words, this is still a very rudimentary overview of everything that goes on inside ChatGPT. That said, maybe now you understand a little more about why this technology has exploded in recent months. The key to all of this is that the data itself is not “supervised” and the AI can take what it has been fed and make sense of it.
In closing, I fed a draft of this entire article to ChatGPT and asked the AI to describe the article in one sentence. Here you go:
ChatGPT is like Google and Wolfram Alpha’s smart cousin, which can do things they can’t, like write stories and code modules.
ChatGPT is supposed to be egoless technology, but if that answer doesn’t give you the creeps, you’re not paying attention. Welcome to the world of Artificial Intelligence!