What Is GPT? GPT-3, GPT-4, and More Explained

Written by Jessica Schulze • Updated on

An overview and comparison of GPT models 1-4, Amazon’s GPT-55X, and more.

[Featured Image] Blue lines of binary code ripple across a black screen in waves.

Artificial intelligence (AI) has generated more than just content in recent years. It's sparked debate, excitement, criticism, and innovation across various industries. One of the most notable and buzz-worthy AI technologies today is GPT, which is often incorrectly equated with ChatGPT. In the following article, you can learn what GPT is, how it works, and what it's used for. We'll also compare and contrast different GPT models, starting with the original transformer and ending with the most recent and advanced entry in OpenAI's catalog: GPT-4. 

What does GPT stand for?

Generative: Generative AI is a technology capable of producing content, such as text and imagery. 

Pre-trained: Pre-trained models are saved networks that have already been taught to resolve a problem or accomplish a specific task using a large data set.

Transformer: A transformer is a deep learning architecture that transforms an input into another type of output. 

Breaking down the acronym above helps us remember what GPT does and how it works. GPT is a generative AI technology that has been previously trained to transform its input into a different type of output. 

What is GPT?

GPT models are general-purpose language prediction models. In other words, they are computer programs that can analyze, extract, summarize, and otherwise use information to generate content. One of the most famous use cases for GPT is ChatGPT, an artificial intelligence (AI) chatbot based on the GPT-3.5 model that mimics natural conversation to answer questions and respond to prompts. GPT was developed by the AI research laboratory OpenAI in 2018. Since then, OpenAI has officially released three subsequent iterations of the model: GPT-2, GPT-3, and GPT-4. 

Read more: Machine Learning Models: What They Are and How to Build Them

Large language models (LLMs)

The term large language model describes any large-scale language model designed for natural language processing (NLP) tasks. GPT models are a subclass of LLMs. 



GPT-1

GPT-1 is the first version of OpenAI's language model. It followed Google's 2017 paper Attention Is All You Need, in which researchers introduced the first general transformer model. Google's revolutionary transformer architecture serves as the framework for Google Search, Google Translate, autocomplete, and all large language models, including Bard and ChatGPT. 


GPT-2

GPT-2 is the second transformer-based language model from OpenAI. It's open-source, trained in an unsupervised fashion, and has over 1.5 billion parameters. GPT-2 was designed specifically to predict and generate the next sequence of text to follow a given sentence. 
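Next-word prediction of this kind can be illustrated with a toy model. The sketch below is plain Python with a made-up mini-corpus, not GPT-2's actual algorithm: it simply predicts the most frequent successor word observed in training, whereas GPT-2 learns far richer statistics with a neural network trained on vast amounts of text.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    # Count, for each word, which words follow it and how often.
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(model, word):
    # Predict the most frequent successor seen during training.
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

# Illustrative mini-corpus
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
predict_next(model, "the")  # → "cat" ("cat" follows "the" twice, "mat" once)
```

GPT-2 does the same job at scale: given the text so far, rank every possible next token and sample from the most likely ones.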


GPT-3

GPT-3, the third iteration of OpenAI's model, has 175 billion parameters, a sizable step up from its predecessor. Its training data includes texts such as Wikipedia entries as well as the open-source data set Common Crawl. Notably, GPT-3 can generate computer code and shows improved performance in niche areas of content creation such as storytelling. 


GPT-4

GPT-4 is the most recent model from OpenAI. It's a large multimodal model (LMM), meaning it can parse image inputs as well as text. This iteration is the most advanced GPT model, exhibiting human-level performance on a variety of professional and academic benchmarks. For comparison, on a simulated bar exam, GPT-3.5 scored in the bottom 10 percent of test-takers, while GPT-4 scored in the top 10 percent. 

Amazon's GPT-55X

Amazon's Generative Pre-trained Transformer 55X (GPT-55X) is a language model based on OpenAI's GPT architecture and enhanced by Amazon's researchers. A few key aspects of GPT-55X include its vast amount of training data, its ability to derive context dependencies and semantic relationships, and its autoregressive nature (using past data to inform future output). 


How does GPT work?

Generative pre-trained transformers are a type of neural network model. As a reminder, neural networks are AI algorithms that teach computers to process information like a human brain would. More specifically, transformers are based on attention mechanisms, a deep learning technique that simulates human attention by ranking and prioritizing input information by importance. Both in our brains and in machine learning models, attention mechanisms help us filter out irrelevant information that can distract from the task at hand. They increase model efficiency by gleaning context and relevance from relationships between elements in data.
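The core of this attention idea can be sketched in a few lines. The following is a minimal scaled dot-product attention example in NumPy (the random query, key, and value matrices are illustrative, not taken from any real model): each output row is a weighted mix of the value vectors, with the weights derived from how similar each query is to each key.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: rows become probability distributions.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Score every query against every key, scale by sqrt(dimension),
    # normalize the scores into weights, then mix the values.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores)  # each row sums to 1
    return weights @ V, weights

# Three tokens with 4-dimensional embeddings (random, for illustration)
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(Q, K, V)
```

The weight matrix is where "ranking input information by importance" happens: a large weight means that token's value vector contributes heavily to the output for that position.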

How to use GPT-3 and GPT-4

Despite the complexity of these language models, their interfaces are relatively simple. If you've ever used ChatGPT, you'll find the text-input, text-output interaction familiar. In fact, you can experiment with GPT-3.5 via chat.openai.com as long as you have an OpenAI account. To train your own model or experiment with the GPT-3 application programming interface (API), you'll need an OpenAI developer account, which you can create on OpenAI's website. After you've signed up and signed in, you'll gain access to the Playground, a web-based sandbox you can use to experiment with the API. 
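Under the hood, API access comes down to sending an HTTP request. As an illustrative sketch (the model name and endpoint shown are assumptions based on OpenAI's public documentation at the time of writing; check the current API reference before use), a chat request body looks roughly like this:

```python
import json

# Minimal request body for a chat completion (illustrative values;
# the model name and parameters may change over time).
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "Summarize what GPT stands for."}
    ],
    "max_tokens": 100,
}

# This JSON body would be POSTed to the chat completions endpoint
# (https://api.openai.com/v1/chat/completions) with an
# "Authorization: Bearer <your API key>" header.
body = json.dumps(payload)
```

The Playground builds and sends requests like this one for you, which makes it a convenient way to explore parameters such as temperature and maximum tokens before writing any code.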

If you have a subscription to ChatGPT Plus, you can access GPT-4 via chat.openai.com. At the top of the interface, there's a tab for GPT-3.5 on the left and GPT-4 on the right. Note that usage is capped, with the cap depending on demand and system performance. To access the GPT-4 API, you'll first need to make a payment of $1 or more. 

How to use GPT-2 

GPT-2 is less user-friendly than its successors and requires a sizable amount of processing power. However, it is open-source and can be used in conjunction with free resources and tools such as Google Colab. To access the GPT-2 model, start with its GitHub repository. There you'll find a data set, release notes, information about drawbacks to be wary of, and experimentation topics OpenAI is interested in hearing about. 


Empower yourself with GPT expertise on Coursera 

Take a deeper dive into the use cases, benefits, and risks of the GPT model by enrolling in the intermediate-level online course Generative Pre-trained Transformers (GPT). Or, introduce yourself to NLP AI with ChatGPT by learning to manipulate its responses and experimenting with its tokens and parameters in the Guided Project Chat-GPT Playground For Beginners: Intro to NLP AI.


This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.