💬 What Is NLP?

Natural Language Processing (NLP) is the branch of AI that gives computers the ability to work with human language: reading it, understanding its meaning, and even producing it. Sounds simple? Language is actually one of the hardest problems in AI.

🦆 Analogy: "I saw her duck." Did you see her dodge something? Or did you see the bird she owns? Same words, two totally different meanings. Humans resolve this in a flash using context. Machines have to learn to resolve ambiguities like this from data, often billions of examples.

Language is full of sarcasm, idioms, pronouns, spelling errors, cultural references, and ever-changing slang. Every sentence is a puzzle. NLP is the set of tools, from simple word counts to massive neural networks, that lets machines start solving that puzzle.

✂️ Tokenization: Chopping Text Into Pieces

Before a machine can "read" text, it needs the text as a list of standardised pieces it can process. We call each piece a token. Tokenization is the act of splitting text into tokens.

💡 Tip: Not just words. Modern models often use subword tokens. The word "unbelievable" might split into un + believ + able. This helps handle rare words: even words the model has never seen can be built from familiar parts.
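
To make the subword idea concrete, here is a minimal sketch of a greedy longest-match splitter. The function name `subword_tokenize` and the tiny `TOY_VOCAB` are hypothetical and chosen for illustration; real tokenizers (such as BPE or WordPiece) learn their vocabulary from data and use more careful rules.

```python
def subword_tokenize(word, vocab):
    """Split a word into known pieces, always taking the longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring, then shrink it one character at a time.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                pieces.append(piece)
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces


# Hypothetical toy vocabulary (an assumption, not taken from any real model).
TOY_VOCAB = {"un", "believ", "able", "break", "read"}

print(subword_tokenize("unbelievable", TOY_VOCAB))  # ['un', 'believ', 'able']
print(subword_tokenize("unbreakable", TOY_VOCAB))   # ['un', 'break', 'able']
```

Note how "unbreakable" is handled even though it never appears in the vocabulary as a whole word: it is assembled from familiar parts.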

After tokenization, each token gets a unique integer ID. The sentence "The cat sat." might become [482, 1751, 992, 13]. The model works with those numbers, not letters.
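
The two steps described so far (split the text, then map each token to an integer) can be sketched as follows. This is a simple whitespace-and-punctuation split, not what a production model uses, and the helper names `tokenize` and `build_vocab` are assumptions. The IDs it produces are small consecutive numbers, unlike the illustrative IDs in the text.

```python
import re


def tokenize(text):
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


def build_vocab(tokens):
    # Assign each distinct token the next unused integer, in order of first appearance.
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


tokens = tokenize("The cat sat.")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]

print(tokens)  # ['The', 'cat', 'sat', '.']
print(ids)     # [0, 1, 2, 3]
```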


🔢 Turning Words Into Numbers: Embeddings

Neural networks only understand numbers. So we need a way to turn every word into a number, or better, a vector (a list of numbers). The simplest approach, Bag of Words, gives each vocabulary word a slot in a giant array: 1 if the word appears in a document, 0 if not (some variants store a count instead). It works, but it loses word order and meaning.
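
A minimal sketch of the binary Bag of Words described above, assuming a small hand-picked `vocabulary` (hypothetical). It also shows the weakness mentioned: two sentences with opposite meanings get the same vector because order is discarded.

```python
def bag_of_words(document, vocabulary):
    # One slot per vocabulary word: 1 if it appears in the document, else 0.
    words = set(document.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


# Hypothetical five-word vocabulary, for illustration only.
vocabulary = ["cat", "dog", "sat", "bit", "the"]

print(bag_of_words("the cat sat", vocabulary))          # [1, 0, 1, 0, 1]
print(bag_of_words("the dog bit the cat", vocabulary))  # [1, 1, 0, 1, 1]
print(bag_of_words("the cat bit the dog", vocabulary))  # [1, 1, 0, 1, 1]
```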

A much richer idea: Word Embeddings. Train the model so that each word maps to a compact vector of, say, 300 numbers, learned such that similar words end up near each other in that vector space.

📍 Analogy: Words on a map. Imagine plotting words on a map where location represents meaning. "Paris" and "London" sit close together (both capitals). "Dog" and "Cat" cluster near each other (both pets). And the directions on the map are consistent: the vector from "king" to "queen" is almost the same as from "man" to "woman", so that direction encodes gender. That's why the famous equation works:

king − man + woman ≈ queen

king:  [0.92, 0.18, 0.74, 0.05, ...]
queen: [0.89, 0.81, 0.71, 0.09, ...]
dog:   [0.12, 0.09, 0.03, 0.88, ...]
cat:   [0.14, 0.11, 0.02, 0.85, ...]

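Notice: king/queen have similar first and third values (royalty-like dimensions) but differ sharply in the second, the kind of gap a gender direction would produce. dog/cat are similar in the fourth (animal dimension). These numbers are illustrative, and real embedding dimensions are rarely this easy to label, but embeddings do capture such patterns automatically from data.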
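
The analogy arithmetic can be tried on toy vectors. In this sketch, the king, queen, dog, and cat vectors are the first four illustrative values shown above; the "man" and "woman" vectors are made up for this example (an assumption, since the lesson gives no values for them). Closeness is measured with cosine similarity, a common choice for comparing embeddings.

```python
import math

EMB = {
    "king":  [0.92, 0.18, 0.74, 0.05],
    "queen": [0.89, 0.81, 0.71, 0.09],
    "dog":   [0.12, 0.09, 0.03, 0.88],
    "cat":   [0.14, 0.11, 0.02, 0.85],
    "man":   [0.10, 0.15, 0.70, 0.06],  # hypothetical values
    "woman": [0.08, 0.80, 0.69, 0.08],  # hypothetical values
}


def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(EMB[a], EMB[b], EMB[c])]
    candidates = [w for w in EMB if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, EMB[w]))


print(analogy("king", "man", "woman"))  # queen
print(cosine(EMB["dog"], EMB["cat"]) > cosine(EMB["dog"], EMB["king"]))  # True
```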

🛠️ What Can NLP Do?

NLP powers a huge range of applications you use every day:

😊 Sentiment Analysis: Is this review positive, negative, or neutral?
🌍 Translation: Convert text from one language to another.
📝 Summarisation: Squeeze a long document down to its key points.
❓ Question Answering: Find the answer to a question inside a passage.
🔮 Autocomplete: Predict the next word (or sentence) a user will type.
🏷️ Named Entity Recognition: Spot people, places, and organisations in text.

🚀 Coming up: Lesson 08, Transformers & LLMs. Classic NLP methods (bag-of-words, simple word embeddings, RNNs) paved the road, but in 2017 the Transformer architecture arrived and changed everything. The next lesson explores how Transformers use attention to handle context and how they scale up to GPT-level language models.

Sentiment Detector 😊😡 Interactive

Type a sentence or short review below. The detector uses a built-in lexicon of positive and negative words, handles simple negation ("not good"), and shows which words influenced the score.
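
A lexicon-based detector of the kind described can be sketched in a few lines. The word lists here are tiny and hypothetical (the detector's actual built-in lexicon is not shown in this lesson), and negation is handled in the simplest way: a negator flips only the very next word.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "boring"}
NEGATORS = {"not", "never", "no"}


def sentiment(text):
    """Return (label, score, influences) for a short piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0
    influences = []  # (word, contribution) pairs that moved the score
    negate = False
    for word in words:
        if word in NEGATORS:
            negate = True
            continue
        value = 1 if word in POSITIVE else -1 if word in NEGATIVE else 0
        if value != 0:
            if negate:
                value = -value  # "not good" counts as negative
            influences.append((word, value))
            score += value
        negate = False  # a negator only affects the word right after it
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return label, score, influences


print(sentiment("This movie is not good"))
# ('negative', -1, [('good', -1)])
print(sentiment("I love this great film"))
# ('positive', 2, [('love', 1), ('great', 1)])
```

Because negation reaches only one word ahead, a phrase like "not very good" would still score as positive in this sketch; handling that takes a longer negation window.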

Next-Word Oracle 🔮 (a tiny language model) Interactive

This is a bigram Markov model, about the simplest language model there is. It learned, from a short built-in corpus, which words tend to follow each other. Pick a starting word, choose a length, and watch it generate text by repeatedly sampling a next word in proportion to how often it followed the current one. Next-word prediction is also the core idea behind GPT, just at a microscopic scale here.


How it works: The model scanned every pair of consecutive words in the corpus and counted them. When generating, it looks up the current word, picks a random successor weighted by how often each one appeared, then repeats. Bigger models also predict the next token one step at a time, but they learn from far more data and condition on much more context than a single previous word.
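
The counting and sampling steps above can be sketched as follows. The `CORPUS` string is a made-up stand-in (the oracle's actual built-in corpus is not shown), and the function names are assumptions.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(corpus):
    # For each word, count which words follow it and how often.
    words = corpus.lower().split()
    successors = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        successors[current][following] += 1
    return successors


def generate(successors, start, length, rng=random):
    out = [start]
    for _ in range(length - 1):
        options = successors.get(out[-1])
        if not options:
            break  # dead end: this word never had a successor in the corpus
        # Sample one successor, weighted by how often it followed the current word.
        following = rng.choices(list(options), weights=list(options.values()))[0]
        out.append(following)
    return " ".join(out)


# Hypothetical toy corpus, for illustration only.
CORPUS = "the cat sat on the mat . the dog sat on the rug . the cat saw the dog ."

model = train_bigrams(CORPUS)
print(model["the"]["cat"])  # 2
print(generate(model, "the", 8, random.Random(0)))
```

Passing a seeded `random.Random` makes a run repeatable; with the default `rng`, each call can produce different text.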