numpy
150 problems
- Accuracy score [Easy]
- AdaBoost fit (decision stumps) [Medium]
- Adadelta optimizer [Medium]
- Adagrad optimizer [Easy]
- Adam optimiser step [Medium]
- Adamax optimizer [Medium]
- AdamW step (decoupled weight decay) [Medium]
- Affine coupling layer (normalizing flow) [Medium]
- Autocorrelation function [Medium]
- Autoencoder forward pass [Easy]
- Bag-of-words encoding [Easy]
- BatchNorm forward (train + eval modes) [Medium]
- Beam search decoding [Medium]
- Bellman equation for value iteration [Medium]
- Bernoulli Naive Bayes classifier [Medium]
- Best Gini-based split (decision tree) [Medium]
- Binary classification with logistic regression [Easy]
- Conv2D forward (naive, stride 1, no pad) [Medium]
- Conv2D forward (padding + stride) [Medium]
- Max pooling 2D forward [Easy]
- Average pooling 2D forward [Easy]
- Conv2D backward (dx, dW) [Hard]
- Mini-CNN forward (capstone) [Medium]
- Token embedding lookup [Easy]
- Sinusoidal positional encoding [Easy]
- Scaled dot-product self-attention [Medium]
- Causal mask: build + apply [Easy]
- Multi-head split + combine [Medium]
- Multi-Head Attention (full layer) [Medium]
- LayerNorm forward [Easy]
- Transformer block forward (pre-LN, residual) [Hard]
- Linear forward (Wx + b) [Easy]
- ReLU forward + backward [Easy]
- Sigmoid forward + backward [Easy]
- MSE loss + backward [Easy]
- Linear backward (chain rule) [Medium]
- SGD step (in place) [Easy]
- Train a 2-layer MLP on XOR [Hard]
- Cohen's kappa score [Medium]
- Confusion matrix [Easy]
- Contrastive loss [Medium]
- Warmup + cosine decay LR schedule [Easy]
- Covariance matrix [Easy]
- Cross-entropy gradient [Medium]
- DBSCAN clustering [Medium]
- Dense block with 2D convolutions [Medium]
- Dice loss [Easy]
- Diffusion forward process [Medium]
- Divide dataset by feature threshold [Easy]
- DPO loss [Medium]
- Dropout layer (forward & backward) [Medium]
- Dynamic-Tanh (DyT) [Medium]
- Efficient sparse window attention [Medium]
- Elastic-Net regression (gradient descent) [Medium]
- Euclidean distance matrix [Easy]
- Exponential moving average of weights [Easy]
- F1 score (binary classification) [Easy]
- Fisher-Yates shuffle [Easy]
- FlashAttention tiled forward [Hard]
- Focal loss (multiclass) [Medium]
- Gaussian process regression (RBF) [Hard]
- GCN layer (message passing) [Medium]
- GELU backward (tanh approximation) [Medium]
- GELU forward (tanh approximation) [Easy]
- Generate random subsets (bootstrap) [Easy]
- GMM E-step (responsibilities) [Medium]
- Gradient checkpointing forward [Easy]
- Clip gradients by global L2 norm [Easy]
- Graph Laplacian [Medium]
- Greedy decoding loop [Easy]
- Group normalization [Medium]
- Grouped-query attention [Medium]
- GRPO objective [Hard]
- He weight initialization [Easy]
- Huber & Hinge losses [Easy]
- Instance normalization [Medium]
- Inverse-transform sampling [Easy]
- Jaccard similarity [Easy]
- k-fold cross-validation [Easy]
- KL divergence (discrete) [Easy]
- K-means: one iteration [Medium]
- k-NN classification (majority vote) [Easy]
- KV cache for autoregressive inference [Medium]
- KV cache compression (MLA) [Hard]
- Label-smoothed cross-entropy [Medium]
- Leaky ReLU activation [Easy]
- Learning-rate range test [Medium]
- Linear regression — gradient descent [Easy]
- Linear regression — normal equation [Medium]
- Lion optimizer step [Medium]
- Log-softmax [Easy]
- Logistic regression — gradient descent [Medium]
- LoRA forward pass [Easy]
- Matmul backward [Medium]
- Multi-class cross-entropy loss [Easy]
- Mutual information [Medium]
- Nesterov accelerated gradient [Medium]
- Neural ODE forward Euler [Medium]
- Noisy top-k gating [Medium]
- Orthonormal basis (Gram-Schmidt) [Medium]
- PageRank (power iteration) [Medium]
- Pairwise cosine-similarity matrix [Easy]
- Pearson correlation coefficient [Easy]
- Pegasos kernel SVM [Medium]
- Perplexity from log-probs [Easy]
- Pointwise mutual information (PMI) [Medium]
- Policy gradient with REINFORCE [Medium]
- PPO clipped objective [Medium]
- Precision@k and NDCG@k [Medium]
- Precision metric [Easy]
- Principal Component Analysis (PCA) [Medium]
- Prioritized experience replay [Medium]
- Q-learning for MDPs [Medium]
- Rejection sampling [Medium]
- Repetition penalty (HuggingFace-style) [Medium]
- Reservoir sampling [Medium]
- Residual block with shortcut [Easy]
- Ridge regression loss [Easy]
- RMSNorm [Easy]
- RMSprop optimizer [Easy]
- Vanilla RNN cell forward [Medium]
- ROC-AUC from scratch [Medium]
- Rotary Position Embedding (RoPE) [Hard]
- Scaled dot-product attention [Medium]
- SELU activation [Easy]
- Sequence padding & masking [Easy]
- SGD with momentum (one step) [Easy]
- Shannon entropy [Easy]
- Singular Value Decomposition (2x2) [Hard]
- Softmax backward [Medium]
- Softmax from scratch [Easy]
- Softmax (multinomial) regression [Medium]
- Sorted polynomial features [Medium]
- Sparse mixture-of-experts layer [Hard]
- Speculative decoding verification [Medium]
- SwiGLU activation [Medium]
- Swish / SiLU activation [Easy]
- TD(0) value update [Easy]
- Temperature scaling [Easy]
- TF-IDF [Medium]
- Time-series anomaly detection [Medium]
- Top-k sampling [Easy]
- Top-p (nucleus) sampling [Medium]
- Triplet loss [Medium]
- UCB1 multi-armed bandit [Easy]
- VAE ELBO loss [Medium]
- Viterbi algorithm [Medium]
- Wasserstein distance (1-D) [Medium]
- Weighted cross-entropy [Easy]
- Weighted multinomial sampling [Easy]