Interactive Guide

How LLMs Actually Work

Not magic. Not mystery. Just mathematics, beautifully orchestrated. Explore the mechanics that power ChatGPT, Claude, and every AI assistant you use.

🧠 Learn by Doing

Interactive demos let you adjust weights, watch learning happen, and see predictions form in real-time.

🔢 Numbers, Not Magic

Every "intelligent" response is just billions of numbers, multiplied and added. See exactly how.

⚡ Simple Core Ideas

The fundamental concepts are surprisingly simple. The power comes from scale, not complexity.

1

What Are Weights?

The "knowledge" of an AI is stored in numbers called weights. Billions of them.

Think of Weights as Knobs

Imagine a mixing board with billions of knobs. Each knob controls how much one piece of information influences the final output. That is what weights are.

When we say GPT-4 has "billions of parameters," we mean it has billions of these adjustable numbers. During training, the AI learns by adjusting these knobs until it produces good outputs.

Try it yourself: Adjust the weights below and watch how the output changes.

Simple Prediction: Price = Size × Weight

[Interactive demo: adjust the weight slider and watch the predicted house price (e.g. $300,000) update.]
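The single-knob idea above can be sketched in a few lines of code. This is a toy model with one weight and illustrative numbers (the demo's actual values may differ):

```python
# A one-weight model, mirroring the demo: the weight is the single "knob"
# that turns house size into a price prediction.
def predict_price(size_sqft: float, weight: float) -> float:
    return size_sqft * weight

weight = 200.0  # dollars per square foot -- try turning this knob
print(predict_price(1500, weight))  # 1500 sq ft at $200/sq ft -> 300000.0
```

Turning the knob to 250.0 would predict $375,000 for the same house; that is all a weight does.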

Key Insight

Real AI models work exactly like this, but with billions of weights instead of one. GPT-4 is rumored to have around 1.7 trillion weights (OpenAI has not confirmed the figure). Each one is a number like 0.0234 or -1.892. The "intelligence" emerges from how all these numbers work together.

2

The Learning Loop

How does an AI "learn"? By making mistakes and adjusting. Over and over and over.

1. Predict
2. Measure Error
3. Adjust
4. Repeat

[Interactive demo: a learning visualization tracks the iteration count, the current weight, the target weight, and the remaining error as the loop runs.]

This is ALL of Machine Learning

Every AI, from simple classifiers to GPT-4, learns through this same loop: Predict → Measure Error → Adjust → Repeat. The only differences are the complexity of the model and the amount of data. ChatGPT did this loop trillions of times on internet text.
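The loop can be written out directly. This is a minimal sketch that fits a single weight with plain gradient descent on made-up data; real models run the same loop over billions of weights:

```python
# Predict -> Measure Error -> Adjust -> Repeat, for one weight.
target_weight = 200.0
data = [(x, x * target_weight) for x in (1.0, 2.0, 3.0)]  # toy (input, output) pairs

weight = 50.0        # start far from the target
learning_rate = 0.01

for step in range(1000):
    for x, y_true in data:
        y_pred = x * weight                  # 1. Predict
        error = y_pred - y_true              # 2. Measure Error
        gradient = 2 * error * x             # slope of squared error w.r.t. the weight
        weight -= learning_rate * gradient   # 3. Adjust
                                             # 4. Repeat (the loops do this)
print(round(weight, 2))  # -> 200.0, the weight the data implies
```

Each pass nudges the weight a little in the direction that shrinks the error; after enough repetitions it settles on the value that fits the data.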

3

From Words to Numbers

Computers cannot read. They need numbers. So we convert text into vectors of numbers called embeddings.

Tokenization Demo

Text → Tokens → Numbers → Embeddings

Click on a token to see its embedding visualization. Each token becomes a vector of ~1500 numbers.

What is a Token?

A chunk of text: a word, part of a word, or a punctuation mark. "understanding" might become ["under", "stand", "ing"].

What is an Embedding?

A list of numbers that captures the "meaning" of a token. Similar words have similar numbers.

Why Does This Work?

Words that appear in similar contexts get similar embeddings. "King" and "Queen" end up close in number space.
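The "close in number space" idea can be made concrete with cosine similarity. The vectors below are made-up 4-number toys, not real embeddings (real ones have hundreds or thousands of dimensions), but the effect is the same:

```python
# Toy embeddings: words used in similar contexts get similar vectors,
# so "king" and "queen" score closer to each other than to "pizza".
import math

embeddings = {
    "king":  [0.90, 0.80, 0.10, 0.20],
    "queen": [0.85, 0.82, 0.15, 0.25],
    "pizza": [0.10, 0.05, 0.90, 0.80],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm  # 1.0 = same direction, 0.0 = unrelated

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # close to 1.0
print(cosine_similarity(embeddings["king"], embeddings["pizza"]))  # much lower
```

This is exactly how similarity search over embeddings works: compare directions, not exact values.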

4

Inside a Neural Network

Data flows through layers of neurons. Each layer transforms the data, extracting more abstract patterns.

Neural Network Visualization

Forward Pass

Data flows left → right. Each neuron multiplies inputs by weights, adds them up, and applies an activation function.

Backward Pass

Error flows right → left. Each weight learns how much it contributed to the error, then adjusts itself.
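The forward pass described above fits in a few lines. This sketch assumes a tiny network (2 inputs, 2 hidden neurons, 1 output) with made-up weights and a ReLU activation:

```python
# Forward pass: each neuron multiplies its inputs by its weights,
# sums them, and applies an activation function.
def relu(x):
    return max(0.0, x)

def forward(inputs, layer_weights):
    activations = inputs
    for weights in layer_weights:  # one list of neurons per layer
        activations = [
            relu(sum(w * a for w, a in zip(neuron, activations)))
            for neuron in weights
        ]
    return activations

layers = [
    [[0.5, -0.2], [0.3, 0.8]],   # hidden layer: 2 neurons, 2 weights each
    [[1.0, 0.5]],                # output layer: 1 neuron, 2 weights
]
print(forward([1.0, 2.0], layers))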

5

How LLMs Generate Text

LLMs do not "think" of whole sentences. They predict one token at a time, then use that prediction to predict the next.

Text Generation Demo

[Interactive demo: the generated output builds one token at a time, e.g. "The quick brown fox".]

The Autoregressive Secret

This is called autoregressive generation. The model only knows how to predict the next token. To write a paragraph, it predicts token 1, adds it to the input, predicts token 2, adds it, and so on. A 500-word response requires ~750 predictions, each using all previous tokens as context.
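The predict-append-repeat loop can be sketched with a stand-in "model". Here a hard-coded lookup table plays the model's role; a real LLM computes a probability for every possible next token using its billions of weights, but the generation loop is the same:

```python
# Autoregressive generation in miniature: predict the next token from the
# tokens so far, append it, and feed the longer sequence back in.
next_token_table = {
    ("The",): "quick",
    ("The", "quick"): "brown",
    ("The", "quick", "brown"): "fox",
    ("The", "quick", "brown", "fox"): "jumps",
}

tokens = ["The"]
while tuple(tokens) in next_token_table:
    # Each prediction sees ALL previous tokens as context.
    tokens.append(next_token_table[tuple(tokens)])

print(" ".join(tokens))  # -> The quick brown fox jumps
```

Note that every step's input includes everything generated so far, which is why long responses cost more compute per token as they grow.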

You Now Understand the Fundamentals

The rest is details and scale. Billions of weights. Trillions of training examples. But the core ideas? You just learned them.

Read More Essays