Deep Sequence Modeling: RNN
This track introduces recurrent neural networks and practical sequence modeling: when order matters, how real data forms sequences, and which input→output patterns show up in NLP, speech, video, and time series.
Prerequisites
Deep Neural Networks
Layers, backpropagation, activation functions, overfitting
Linear Algebra & Calculus
Matrix multiplication, chain rule, partial derivatives
Lessons
Foundations of deep sequence modeling
History and context matter — ignore them at your peril
From static networks to time-aware models
Recurrence lets a network carry memory across time steps
RNN internal mechanics & formal structure
Shared weights across time; tanh squashes hidden state
Bringing sequence modeling to the real world
Words → tokens → embeddings → flow through the RNN
Training RNNs: BPTT & gradient pathologies
Unroll the RNN through time; gradients vanish or explode
Training an RNN in PyTorch
nn.RNN, hidden states, autograd BPTT, training loop
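As a rough preview of the last lesson, here is a minimal sketch of nn.RNN in use. It first checks that the layer's output matches a hand-written recurrence that reuses the same weights at every time step with a tanh nonlinearity, then runs a few training steps in which autograd backpropagates through time. The toy sizes, the random data, the linear `head`, the learning rate, and the clipping threshold are all assumptions made for illustration, not part of the course material.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy sizes, chosen only for illustration.
batch, seq_len, input_size, hidden_size = 4, 6, 3, 5

rnn = nn.RNN(input_size, hidden_size, nonlinearity="tanh", batch_first=True)
x = torch.randn(batch, seq_len, input_size)   # random stand-in for real sequences
h0 = torch.zeros(1, batch, hidden_size)       # initial hidden state

# out holds the hidden state at every step; h_n holds only the final one.
out, h_n = rnn(x, h0)

# Manual recurrence: the same weight matrices are applied at every time step.
h = h0[0]
manual_steps = []
for t in range(seq_len):
    h = torch.tanh(
        x[:, t] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
        + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
    )
    manual_steps.append(h)
manual = torch.stack(manual_steps, dim=1)
matches = torch.allclose(out, manual, atol=1e-5)

# A tiny training loop: regress a random target from the final hidden state.
head = nn.Linear(hidden_size, 1)              # assumed readout layer
target = torch.randn(batch, 1)
params = list(rnn.parameters()) + list(head.parameters())
opt = torch.optim.SGD(params, lr=0.05)
loss_fn = nn.MSELoss()

losses = []
for step in range(20):
    opt.zero_grad()
    _, h_last = rnn(x, h0)
    loss = loss_fn(head(h_last[-1]), target)
    loss.backward()                           # autograd unrolls the RNN: this is BPTT
    # Clipping is one common guard against exploding gradients.
    nn.utils.clip_grad_norm_(params, max_norm=1.0)
    opt.step()
    losses.append(loss.item())
```

Comparing `out` with `manual` is a quick way to confirm what the layer computes; in practice the data, loss, and readout would come from the task at hand.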
Unlocks
Attention Is All You Need
Replace recurrence with parallelizable self-attention
LSTMs & GRUs
Gating mechanisms that mitigate the vanishing gradient problem