What Does "Generative" Actually Mean?

Most AI you've encountered is discriminative: it looks at something that already exists and puts a label on it. A spam filter reads your email and decides "spam or not spam." An image classifier looks at a photo and says "cat" or "dog." These systems are incredibly useful, but they never produce anything new.

Generative AI does the opposite. Instead of labelling existing content, it creates brand-new content from scratch: text, images, audio, video, even code. ChatGPT writes an essay. Midjourney paints a fantasy landscape. GitHub Copilot suggests a function. Suno composes a pop song. None of that content existed before you asked.

Think of the difference this way: a discriminative model is a wine critic who can tell you whether a wine is good or bad. A generative model is the winemaker who produces a new vintage, one that has never existed before.

The Main Families of Generative AI

Under the hood, there are several distinct approaches to generating content. The three you'll hear about most are Large Language Models (LLMs), GANs, and Diffusion Models.

Large Language Models: Next-Token Prediction at Scale

LLMs like the one behind ChatGPT generate text by doing one surprisingly simple thing, over and over: predicting the most likely next word (technically "token") given everything that came before it. Feed it "The cat sat on the…" and it assigns probabilities to thousands of possible next words ("mat," "floor," "roof"), then samples one, then predicts the word after that, and so on.
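To make that loop concrete, here is a minimal sketch in Python. It is not how a real LLM is built: the `TOY_MODEL` table, its probabilities, and the `<end>` marker are made up for illustration, and this toy looks only at the last word, whereas a real model conditions on the whole context. What it does show is the predict, sample, append cycle described above.

```python
import random

# Hypothetical toy "model": maps the last word to next-word probabilities.
# A real LLM scores every token in a large vocabulary using the full context.
TOY_MODEL = {
    "the": {"cat": 0.5, "mat": 0.3, "roof": 0.2},
    "cat": {"sat": 0.9, "slept": 0.1},
    "sat": {"on": 1.0},
    "slept": {"on": 1.0},
    "on": {"the": 1.0},
    "mat": {"<end>": 1.0},
    "roof": {"<end>": 1.0},
}

def next_token_probs(context):
    """Return the next-token distribution given the tokens so far."""
    return TOY_MODEL[context[-1]]

def generate(prompt, max_new_tokens=10, seed=0):
    """Repeatedly sample one token and append it to the running text."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)
        choices = list(probs)
        weights = [probs[t] for t in choices]
        token = rng.choices(choices, weights=weights, k=1)[0]
        if token == "<end>":
            break
        tokens.append(token)
    return tokens

print(" ".join(generate(["the", "cat", "sat", "on", "the"])))
```

Changing `seed` changes which words get sampled, which is why the same prompt can produce different continuations.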

The "large" part matters. These models are trained on enormous slices of the internet and books, which gives them a rich statistical sense of how language, facts, and reasoning flow. To go deeper on how transformers power this process, see our Lesson 8: Transformers & LLMs explainer. And if you're curious specifically about ChatGPT, check out How Does ChatGPT Work?

GANs: The Forger and the Detective

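Generative Adversarial Networks, or GANs, were the breakthrough behind eerily realistic "this person does not exist" portraits. They work through a competitive game between two neural networks: a generator (the forger) that produces fake samples, and a discriminator (the detective) that tries to tell those fakes apart from real data.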

Analogy: Imagine a master counterfeiter printing fake banknotes and a police forensics expert examining them. Every time the expert spots a flaw, the counterfeiter goes back to the drawing board and improves the notes. Every time the notes get better, the expert sharpens their detection skills. After thousands of rounds, the counterfeiter is producing notes so good that even the expert can barely tell them apart. That's a GAN.

GANs were dominant in the early days of AI-generated faces and artistic style transfer. Their weakness is training instability: the two networks can fall out of balance, leading to repetitive or nonsensical output.
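The back-and-forth between forger and detective can be sketched with a deliberately tiny example. Everything here is an illustrative assumption rather than a real GAN architecture: the "real data" is a single number drawn around 4.0, the forger is a straight line `a * z + b`, the detective is a one-input logistic classifier, and the gradients are written out by hand instead of using a deep-learning library. Like real GANs, this toy is not guaranteed to settle down nicely.

```python
import math
import random

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # forger (generator): fake = a * z + b
    w, c = 0.0, 0.0   # detective (discriminator): D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)   # one sample of "real" data (assumed)
        z = rng.gauss(0.0, 1.0)      # random input for the forger
        fake = a * z + b

        # Detective's turn: lower -[log D(real) + log(1 - D(fake))].
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = (d_real - 1.0) * real + d_fake * fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Forger's turn: lower -log D(fake), i.e. try to fool the detective.
        d_fake = sigmoid(w * fake + c)
        grad_fake = (d_fake - 1.0) * w
        a -= lr * grad_fake * z
        b -= lr * grad_fake

    return {"a": a, "b": b, "w": w, "c": c}

params = train_toy_gan()
print(params)
```

The two update blocks are the whole idea: each network takes a small step to beat the other, and then they swap turns.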

Diffusion Models: Sculpting from Static

Diffusion models power today's most popular image generators: Stable Diffusion, DALL-E, and Midjourney all rely on this approach. The core idea is delightfully counterintuitive: start with pure noise and gradually clean it up.

During training, the model learns to reverse a noising process: clean images are deliberately corrupted with noise, and the model learns to undo that corruption. Think of it as TV static: each step, the model learns to remove a tiny bit of the "snow" until a clear picture emerges. When you type a prompt, the model starts from random noise and takes a series of small denoising steps (typically tens to hundreds, depending on the sampler), each guided by your words, until a coherent image appears.
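Here is a small sketch of the noising side of that process, under stated assumptions. It uses the closed-form noising rule and a linear noise schedule in the style of DDPM, a common diffusion formulation; the article does not say which formulation any particular product uses. The four-number `image`, the function names, and the schedule values are all hypothetical, and the "denoiser" cheats by being handed the true noise, where a real system would use a trained neural network to estimate it.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each noising step."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Jump straight to a noisy version of x0; returns (noisy values, noise used)."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    xt = [scale_signal * x + scale_noise * n for x, n in zip(x0, noise)]
    return xt, noise

def recover_x0(xt, predicted_noise, alpha_bar):
    """Undo add_noise given a noise estimate; exact only if the estimate is perfect."""
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    return [(x - scale_noise * n) / scale_signal for x, n in zip(xt, predicted_noise)]

rng = random.Random(0)
alpha_bars = alpha_bar_schedule(1000)
image = [0.2, -0.5, 0.9, 0.0]   # stand-in for a few pixel values
noisy, true_noise = add_noise(image, alpha_bars[500], rng)

# A trained network would predict the noise from `noisy`.
# Here we cheat and pass the true noise to show what a perfect guess buys you.
restored = recover_x0(noisy, true_noise, alpha_bars[500])
print(noisy)
print(restored)
```

Training amounts to showing the network many `(noisy, true_noise)` pairs so that its guesses get closer to the noise that was actually added.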

Analogy: It's like a sculptor who starts with a block of marble (the static/noise) and chips away a little at a time. They don't draw the statue first and fill it in; they reveal it by removing what doesn't belong, guided by a vision of the final form.

Compared to GANs, diffusion models are more stable to train and produce strikingly diverse, high-quality outputs. The trade-off is speed: all those denoising steps take more computation than a single GAN forward pass.

Model Family | How It Generates | Typical Use | Key Strength
LLMs | Predicts next token | Text, code, chat | Fluent language & reasoning
GANs | Forger vs. Detective | Faces, style transfer | Fast, sharp output
Diffusion | Denoises from static | Images, video, audio | Diversity & quality

The Temperature Knob: How Randomness Shapes Output

Every generative AI system has some concept of randomness baked in. For LLMs, it's made explicit through a setting called temperature. When the model has assigned probabilities to possible next tokens, temperature controls whether it plays it safe or takes a risk.
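One common way temperature is applied is to divide the model's raw scores by the temperature before turning them into probabilities with a softmax. The sketch below assumes that scheme; the three candidate words and their scores are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)   # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three candidate next words.
words = ["mat", "floor", "roof"]
logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, dict(zip(words, (round(p, 3) for p in probs))))
```

At low temperature nearly all of the probability piles onto the top-scoring word, so sampling plays it safe. At high temperature the probabilities flatten out, so less likely words get picked more often.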

Image generators have a related control: the amount of "guidance" applied at each denoising step. More guidance keeps the result closer to your prompt but potentially less creative. Less guidance gives dreamier, less literal results.
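The text does not say which guidance scheme is meant, so treat the following as an assumption: a widely used one is classifier-free guidance, where the model makes two noise predictions at each step (one ignoring the prompt, one using it) and the guidance scale decides how far to push toward the prompted one. The function name and the sample numbers are hypothetical.

```python
def apply_guidance(noise_uncond, noise_cond, guidance_scale):
    """Blend two noise predictions, classifier-free-guidance style.

    guidance_scale = 0 ignores the prompt, 1 uses the prompted prediction
    as-is, and values above 1 push further in the prompt's direction.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]

# Hypothetical noise predictions for two pixels.
without_prompt = [0.0, 1.0]
with_prompt = [1.0, 3.0]

for scale in (0.0, 1.0, 7.5):
    print(scale, apply_guidance(without_prompt, with_prompt, scale))
```

Larger scales exaggerate whatever the prompt changed about the prediction, which is why heavy guidance tends to look more literal.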

Key takeaway: Temperature is the creativity dial. Crank it up for brainstorming and art; turn it down when you need reliable, on-topic answers. You can experiment with this directly in Lesson 9: Generative AI.

Real-World Uses

Generative AI has moved from research labs into daily tools faster than almost any technology in history. Current applications include:

A Balanced Note on Risks

Generative AI is powerful enough that its misuse is a genuine concern, not a sci-fi worry:

Being informed about how these systems work, which is exactly what you're doing, is the first step toward using them responsibly.

Frequently Asked Questions

Is ChatGPT generative AI?

Yes. ChatGPT is built on OpenAI's GPT series of models (such as GPT-4), which are Large Language Models, a type of generative AI. It generates new text by predicting one token at a time, guided by your prompt and the conversation history. It doesn't retrieve pre-written answers; it constructs each response on the fly.

How do AI image generators work?

Most modern AI image generators, including Stable Diffusion, DALL-E, and Midjourney, use diffusion models. You provide a text prompt; the model starts from pure random noise and runs it through many denoising steps, each step steered by the meaning of your words, until a coherent image emerges. The text is interpreted by a separate text encoder (such as the one in CLIP) that acts as a bridge between words and pixels.

What is temperature in AI?

Temperature is a number (often between 0 and 2) that controls how much randomness an LLM uses when choosing its next word. At temperature 0 the model is effectively deterministic: it always picks the most likely word. At higher temperatures it samples more broadly, producing output that is more surprising and creative but sometimes less accurate. Many AI tools expose this as a "creativity" slider.

Key takeaway: Ready to see generative AI in action? In Lesson 9 you can tweak the temperature dial yourself, watch how output changes, and play with a generative art sandbox, no code required.
