ML System Design

Reverse-engineer two real production ML systems, YouTube's recommender and a modern RAG search system, layer by layer. Then transfer the framework to four case studies. By the end, the 45-minute ML system design interview round is a rehearsed walkthrough, not a guess.

Updated 16 days ago

This course has one rule: you reverse-engineer real production ML systems, layer by layer, until you can design one in 45 minutes from a blank whiteboard.

By the end you will be able to walk into the ML system design round at Meta E5, Google L5, or Amazon SDE-III and execute a framework you have rehearsed against two real systems and four case studies.

Every lesson follows the same shape: the problem at this layer, the naive solution that breaks, the production design that works, the math, and what the interviewer is actually grading.


The arc

Part | What you build | Layer
0 | The 6-step framework | The interview itself
1 | YouTube-style recommender | The classical interview canon
2 | Production RAG search system | The modern interview surface
3 | 4 case studies | Framework transfer to ad CTR, fraud, ETA, multimodal
IR | Interview readiness | Mock-round walkthroughs + final quiz

What you reverse-engineer in Part 1 (this is where you start)

Lesson | What you add | What breaks without it
0 | The 6-step framework | You wing it and lose the round
1 | Define the problem | You optimise the wrong metric
2 | Data pipeline | Your model trains on lies
3 | Retrieval (two-tower + ANN) | You can't serve at 1B scale
4 | Ranking (with calibration) | A/B test results stop making sense
5 | Online serving | You blow the latency budget
6 | Evaluation and A/B testing | You ship novelty effects as wins
7 | Monitoring and retraining | Your model rots in production

What you reverse-engineer in Part 2

Lesson | What you add | What breaks without it
8 | Define the problem (RAG) | You confuse "use a vector DB" with a real RAG system
9 | Retrieval and reranking | Pure dense retrieval misses rare entities; no reranker means precision dies
10 | Generation and LLM serving | Hallucinations ship; cost runs away; latency tanks
11 | RAG evaluation | You can't tell if the system is faithful or just fluent

What you transfer in Part 3 (case studies)

The framework is now applied to four distinct interview prompts. Each lesson walks the full 6 steps end to end on one canonical case, highlighting the wrinkles that distinguish it from the two flagships.

Lesson | Case | The wrinkle
12 | Ad CTR prediction (Google-style) | Auction mechanics; calibration is mandatory; counterfactual logging
13 | Real-time fraud detection (Stripe-style) | Class imbalance + adversarial drift; streaming features; asymmetric cost
14 | ETA prediction (Uber-style) | Spatial-temporal features; pinball loss; quantile outputs (P50 / P90)
15 | Multimodal search (CLIP-style) | Contrastive joint training (InfoNCE); cross-modal eval; modality alignment
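
To make one of these wrinkles concrete: the ETA case pairs pinball loss with quantile outputs (P50 / P90). The following is a minimal sketch of that loss, not the course's implementation. The function name and the toy trip durations are assumptions made up for illustration.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff > 0) costs q per unit of error;
        # over-prediction (diff < 0) costs (1 - q) per unit of error.
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Hypothetical trip durations in minutes (illustrative only).
actual = [10.0, 12.0, 20.0]
too_low = [8.0, 10.0, 18.0]    # every prediction is 2 minutes short
too_high = [12.0, 14.0, 22.0]  # every prediction is 2 minutes long

p90_low = pinball_loss(actual, too_low, 0.9)
p90_high = pinball_loss(actual, too_high, 0.9)
print(f"P90 loss when under-predicting: {p90_low:.2f}")
print(f"P90 loss when over-predicting:  {p90_high:.2f}")
```

At q = 0.9 the same 2-minute error costs roughly nine times more when the prediction is too low than when it is too high, which is what pushes a P90 head toward a conservative upper estimate.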

Interview readiness

The closing lesson brings together two fully narrated 45-minute mock rounds (recsys and RAG), the framework cheat sheet, the consolidated L4 ceilings and L5 promote signals from across the entire course, and the day-of playbook for the actual interview.


What this course is not

It is not a survey of every ML architecture. The architectures appear when the system needs them. It is not a generic SWE system design course either: distributed systems primitives appear in the lessons that need them, never as standalone abstractions.

It is not Exponent or HelloInterview's framework restated. We adopt the 6-step framework as the spine because it is industry-standard, and then we layer in the practitioner-grade depth those courses skip: real latency budgets, real embedding store sizing, real drift detectors, real A/B test power calculations.
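
As a rough illustration of two of those calculations, here is a back-of-envelope sketch of embedding store sizing and an A/B test power calculation. Every number is an assumption chosen for the example, not a figure from the course: the 128-dimensional float32 embeddings, the 5.0% baseline rate, and the 5.5% target rate are all hypothetical. The sample-size formula is the standard normal approximation for comparing two proportions.

```python
import math
from statistics import NormalDist


def embedding_store_bytes(n_items, dim, bytes_per_value=4):
    """Raw size of a dense embedding table, ignoring index overhead."""
    return n_items * dim * bytes_per_value


def samples_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate users per arm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Assumed: 1B items, 128-dim float32 vectors -> 512 GB (decimal) before any
# ANN index overhead, compression, or replication.
store_bytes = embedding_store_bytes(1_000_000_000, 128)
print(f"Embedding store: {store_bytes / 1e9:.0f} GB")

# Assumed: detect a lift from 5.0% to 5.5% click-through.
n_per_arm = samples_per_arm(0.05, 0.055)
print(f"Users needed per arm: {n_per_arm}")
```

Halving the detectable lift roughly quadruples the required sample, because the effect size is squared in the denominator.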

