Course 2 · Module 05 · 65 minutes

Find structure
nobody told
you to look for.

Until now, you've trained models on labeled data — "this is spam, this isn't." Unsupervised learning is the opposite: you give the algorithm raw data with no labels, and it discovers patterns. Customer segments. Anomaly clusters. The hidden groups you didn't know existed.

You'll watch

K-means converge live

You'll adjust

k with a slider

You'll segment

A real dataset


Part 01 · The shift

No labels.
No teacher.

In supervised learning (Modules 3 & 4), every training example came with the right answer. In unsupervised learning, you just have data — and the algorithm has to find the structure on its own. Both are essential. They solve different problems.

// Modules 3 & 4 · Supervised

You knew the answer.

"Here are 1000 examples with the right answer. Learn the pattern, then predict on new ones."

You have

Features (X) + Labels (y)

Goal

Predict y from X on new data

Examples → Spam classifier (label: spam/not)
→ House price predictor (label: price)
→ Disease diagnosis (label: condition)

// This module · Unsupervised

You don't.

"Here are 1000 data points. Find the structure. Tell me what you see."

You have

Features (X) — no labels at all

Goal

Discover groups, anomalies, or simpler structure

Examples → Customer segmentation (who behaves alike?)
→ Anomaly detection (what's weird?)
→ Topic modeling (what themes exist?)
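In code, the difference shows up in what you pass to `fit`. A minimal sketch with scikit-learn, assuming a made-up toy dataset of two blobs (the data and variable names are illustrative, not from the course):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy data (hypothetical): two blobs of 50 points each in 2-D.
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(3.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)  # labels exist only in the supervised case

# Supervised: the model sees features AND labels.
clf = LogisticRegression().fit(X, y)
predicted = clf.predict(X)

# Unsupervised: the model sees features only.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
clusters = km.labels_  # cluster ids are arbitrary: 0/1 may be swapped relative to y
```

Note that the cluster ids carry no meaning by themselves. The algorithm finds the groups; naming them is still your job.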

Part 02 · Hands on · Live algorithm

Watch k-means
find the clusters.

150 data points. No labels. Below: the actual k-means algorithm running step-by-step, in your browser. Adjust k. Press "Step" to do one iteration at a time, or "Run" to watch it converge. Try "Shuffle" to see how random initialization changes the outcome.

How to play.

Set k (number of clusters) with the slider. Then either press Step to do one iteration (assign points → move centroids), or Run to animate to convergence. Shuffle randomizes the starting centroids — important! Different starts can give different final clusters. That's the k-means catch.

Interactive demo: 150 data points · no labels

Diamond = centroid · circles = points colored by current cluster

Controls: number of clusters (k), slider from 1 to 8, shown at 4. Readouts: iteration, inertia, status.
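You can reproduce the "Shuffle" effect outside the browser. The sketch below assumes synthetic blobs from `make_blobs` as a stand-in for the demo's 150 points, and forces a single random start per run (`init="random"`, `n_init=1`) so each seed behaves like one press of Shuffle:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Stand-in data (assumption): 150 points drawn around 4 centers.
X, _ = make_blobs(n_samples=150, centers=4, cluster_std=1.5, random_state=7)

inertias = []
for seed in range(5):
    # One random initialization per run, so the seed is the only thing that changes.
    km = KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X)
    inertias.append(km.inertia_)
    print(f"seed={seed}  iterations={km.n_iter_}  inertia={km.inertia_:.1f}")

# Keeping the best of several restarts is what n_init > 1 does inside scikit-learn.
best = min(inertias)
```

If the printed inertias differ between seeds, those runs converged to different final clusterings. That is why restarting several times and keeping the lowest inertia is the usual defense.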

Part 03 · The algorithm

Four steps. Forever.
That's all it does.

K-means (Lloyd's algorithm, 1957) is one of the oldest ML algorithms and still one of the most used. The mechanics fit in four lines.

Place k centroids randomly

Pick k random points in the data space. These are your initial cluster centers. Their positions will move as the algorithm runs, but where you start matters more than you'd think.

// random_state controls reproducibility
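Step 1 as a NumPy sketch. It assumes "random points in the data space" means uniform draws inside the bounding box of the data; the function name `init_centroids` and its `seed` argument are illustrative and play the role that `random_state` plays in scikit-learn:

```python
import numpy as np

def init_centroids(X, k, seed=None):
    """Draw k centroids uniformly inside the bounding box of X."""
    rng = np.random.default_rng(seed)
    low, high = X.min(axis=0), X.max(axis=0)
    return rng.uniform(low, high, size=(k, X.shape[1]))

# Hypothetical data: 150 points in 2-D.
X = np.random.default_rng(0).normal(size=(150, 2))
centroids = init_centroids(X, k=4, seed=42)
same_again = init_centroids(X, k=4, seed=42)  # same seed, same starting centroids
```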

Assign each point to nearest centroid

For every data point, compute its distance to each centroid. Assign it to the closest one. Now you have k provisional clusters.

// usually Euclidean distance: √(Σᵢ (xᵢ − cᵢ)²)
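Step 2 as a NumPy sketch, assuming Euclidean distance. The function name `assign_points` and the four sample points are made up for illustration:

```python
import numpy as np

def assign_points(X, centroids):
    """Return, for each point, the index of its nearest centroid."""
    # distances[i, j] = Euclidean distance from point i to centroid j
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
centroids = np.array([[0.0, 0.5], [5.5, 5.0]])
labels = assign_points(X, centroids)
print(labels)  # [0 0 1 1]
```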

Move centroids to cluster means

For each cluster, compute the average position of all its assigned points. Move the centroid to that average. This is the "means" in k-means.

// new_centroid = mean of assigned points
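Step 3 as a NumPy sketch. One detail the description leaves open is what to do with a cluster that has no points; this version assumes the centroid simply stays where it was, which is one choice among several:

```python
import numpy as np

def update_centroids(X, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = centroids.copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
            new_centroids[j] = members.mean(axis=0)
    return new_centroids

X = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]])
labels = np.array([0, 0, 1, 1])
old = np.array([[1.0, 1.0], [3.0, 3.0]])
new = update_centroids(X, labels, old)
print(new)  # [[0. 1.] [5. 4.]]
```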

Repeat until centroids stop moving

Go back to step 2 (the assignment step). Assignments may change because the centroids moved. Loop until the centroids barely move between iterations: that's convergence. On small, well-separated datasets this often takes only 5 to 20 iterations.

// stop when Δcentroid < tolerance
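Put together, the four steps fit in one short function. This is a teaching sketch, not scikit-learn's implementation: the name `kmeans`, the `tol` and `max_iter` defaults, and the empty-cluster rule are all assumptions made here.

```python
import numpy as np

def kmeans(X, k, tol=1e-4, max_iter=100, seed=None):
    rng = np.random.default_rng(seed)
    # Step 1: place k centroids uniformly inside the bounding box of the data.
    centroids = rng.uniform(X.min(axis=0), X.max(axis=0), size=(k, X.shape[1]))
    for iteration in range(1, max_iter + 1):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # assumption: empty clusters stay put
                new_centroids[j] = members.mean(axis=0)
        # Step 4: stop when the largest centroid movement is below the tolerance.
        shift = np.linalg.norm(new_centroids - centroids, axis=1).max()
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels, iteration

# Hypothetical data: two blobs of 75 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.4, size=(75, 2)),
               rng.normal(3.0, 0.4, size=(75, 2))])
centroids, labels, n_iter = kmeans(X, k=2, seed=0)
print(f"stopped after {n_iter} iterations")
print(centroids.round(2))
```

Try a few different `seed` values. Some starts can leave a centroid stranded with no points, which is the same initialization catch the live demo shows.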

Part 04 · The hardest question

How do you
pick k?

K-means won't tell you how many clusters to use. You have to choose. The most-used trick is the "elbow method" — try several values of k, plot the result, and look for where the improvement bends.

The elbow method

For each k from 1 to ~10, run k-means and record the inertia (the sum of squared distances from each point to its assigned centroid). Inertia generally drops as k increases, but at some point the marginal improvement gets tiny. That's the elbow. That's your k.

The intuition: Below the elbow, each extra cluster captures a real group. Above the elbow, you're just splitting noise.
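The elbow loop, sketched with scikit-learn on synthetic blobs (the dataset is an assumption, chosen so that four groups exist). The block also recomputes inertia by hand for one fit, so you can see exactly what the number measures:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Stand-in data (assumption): 300 points around 4 centers.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=42)

ks = list(range(1, 11))
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

# Inertia by hand for k=4: sum of squared distances to each point's own centroid.
km4 = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
manual = ((X - km4.cluster_centers_[km4.labels_]) ** 2).sum()

for k, inertia in zip(ks, inertias):
    print(f"k={k:2d}  inertia={inertia:10.1f}")
```

Plot `inertias` against `ks` and look for the bend. On real data the elbow is often less sharp than on clean blobs, so treat it as a hint rather than a verdict.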

Elbow method: visual and cheap to compute. Most common in practice.

Silhouette score: measures how tight each cluster is and how well separated it is from its neighbors. Ranges from -1 to 1; higher is better. More principled but slower.

Domain knowledge: sometimes you just know there should be ~3 customer segments. Don't overthink it.
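The silhouette approach looks like this in scikit-learn. A sketch on synthetic blobs (an assumed dataset); `silhouette_score` needs at least two clusters, so the loop starts at k=2:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Stand-in data (assumption): 300 points around 3 centers.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.7, random_state=1)

scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)  # mean silhouette over all points

best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(f"k={k}  silhouette={s:.3f}")
print("highest silhouette at k =", best_k)
```

The score compares every point against every other point, which is why it gets slow on large datasets.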

Part 05 · Real segmentation

Segment 500 customers
with three lines of sklearn.

A real customer dataset (annual spend + visit frequency). Find natural groups. Use the elbow method to pick k. Visualize the segments. This is what your data team is doing in their Slack #segmentation channel today.
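A sketch of what the segmentation can look like. The course dataset is not reproduced here, so this uses a synthetic stand-in with the same two columns; every number in it is invented. It also scales the features first, an added assumption that matters because spend and visit counts live on very different scales:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in (hypothetical numbers): 500 customers,
# columns are annual spend and visits per year.
X = np.vstack([
    rng.normal([300, 5], [60, 1.5], size=(200, 2)),
    rng.normal([1200, 20], [150, 4], size=(200, 2)),
    rng.normal([3000, 8], [400, 2], size=(100, 2)),
])

# The three lines that do the work: scale, configure, fit.
X_scaled = StandardScaler().fit_transform(X)
model = KMeans(n_clusters=3, n_init=10, random_state=0)
segments = model.fit_predict(X_scaled)

# Describe each segment in the original units.
for s in range(3):
    spend, visits = X[segments == s].mean(axis=0)
    size = int((segments == s).sum())
    print(f"segment {s}: {size} customers, avg spend {spend:.0f}, avg visits {visits:.1f}")
```

Here k=3 is baked into the fake data. On a real dataset you would pick it with the elbow method or a silhouette check before trusting the segments.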


Part 06 · Beyond k-means

Three other unsupervised
methods you'll meet.

K-means is the workhorse, but it's not the only tool. Each of these handles things k-means can't.

Hierarchical

Agglomerative Clustering

Builds a tree of merges: start with each point as its own cluster, then repeatedly merge the closest pair. You get every level of clustering at once.

Use when: you don't know k, and you want a dendrogram
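A sketch of "every level at once", using SciPy's hierarchical clustering on three made-up blobs. Ward linkage is an assumption here, one of several merge rules; scikit-learn's `AgglomerativeClustering` offers the same idea behind an estimator interface:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
# Hypothetical data: three blobs of 20 points each.
X = np.vstack([rng.normal(c, 0.3, size=(20, 2)) for c in (0.0, 3.0, 6.0)])

# Z records the full merge tree: one row per merge, n - 1 rows in total.
Z = linkage(X, method="ward")

# Cut the same tree at different levels without re-running anything.
labels_2 = fcluster(Z, t=2, criterion="maxclust")
labels_3 = fcluster(Z, t=3, criterion="maxclust")
print("clusters at the 2-level cut:", len(set(labels_2)))
print("clusters at the 3-level cut:", len(set(labels_3)))
```

Passing `Z` to `scipy.cluster.hierarchy.dendrogram` draws the tree, provided matplotlib is installed.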

Density-based

DBSCAN

Finds clusters of any shape (not just blobs). Marks outliers as "noise" instead of forcing them into a cluster. Doesn't need k upfront — just two density parameters.

Use when: clusters are non-spherical or you want anomaly detection
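A DBSCAN sketch on two interleaved half-moons, a shape k-means handles badly. The dataset, the planted outlier, and the `eps` and `min_samples` values are all assumptions picked for this toy example; real data usually needs tuning:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
X = np.vstack([X, [[3.0, 3.0]]])  # one planted outlier, far from both moons

# The two density parameters: neighborhood radius and minimum neighbors.
db = DBSCAN(eps=0.2, min_samples=5).fit(X)
labels = db.labels_  # the label -1 marks noise

n_clusters = len(set(labels.tolist())) - (1 if -1 in labels else 0)
n_noise = int((labels == -1).sum())
print("clusters found:", n_clusters)
print("points marked as noise:", n_noise)
```

The planted point ends up labeled -1 because nothing lies within `eps` of it. That noise label is what makes DBSCAN usable as a simple anomaly detector.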

Dim. reduction

PCA

Principal Component Analysis has a different goal: compress many dimensions into a few while keeping as much of the variance as possible. Used before plotting high-dimensional data, or as input to other models.

Use when: too many features to visualize or to feed downstream
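A PCA sketch with scikit-learn. The data is synthetic and built on purpose so that 10 features mostly vary along 2 hidden directions; on real data the share of variance kept by two components is usually lower:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical data: 200 samples, 10 features driven by 2 hidden factors plus noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)  # shape (200, 2), ready for a scatter plot
explained = pca.explained_variance_ratio_.sum()
print(f"variance kept by 2 components: {explained:.1%}")
```

Check `explained_variance_ratio_` before trusting a 2-D plot. If two components keep only a small share of the variance, the picture hides most of the structure.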

Course 2 · Module 05 complete

You can now find groups
without being told what to look for.

You ran k-means yourself, watched it converge, segmented real customers, and know when to reach for DBSCAN or PCA instead. That covers the practical unsupervised toolkit — and you've now done the full classical ML quartet: regression, classification, trees, and clustering.

Up next · Course 2 · Module 06

Neural Networks from Scratch

The pivot to deep learning starts here. You'll build a single neuron in JavaScript, watch backpropagation happen step by step, then train a real neural net in the browser. After this module, "deep learning" stops being a black box.

