SGD step (in place)
Background
Stochastic gradient descent is the simplest optimiser there is: nudge each parameter a little way down its gradient. Writing it correctly is mostly about two things people get wrong — mutating the parameters in place (so the caller's arrays actually change) and validating the inputs (so a mismatched params/grads list fails loudly instead of silently skipping updates). Everything fancier — momentum, weight decay, Adam — is built on top of this two-line update.
Problem statement
Implement sgd_step(params, grads, lr). For each parameter tensor $p$ with gradient $g$, apply, in place:
$$p \leftarrow p - \mathrm{lr} \cdot g$$
The function returns None — it mutates the parameter arrays directly. It raises ValueError if len(params) != len(grads), or if any params[i].shape != grads[i].shape.
Input
- params (list[np.ndarray]): the model parameters. Mutated in place.
- grads (list[np.ndarray]): the gradients; same length as params, same shape per element.
- lr: scalar learning rate.
Output
Returns None. Updates happen in place, so the caller sees the new values through its existing references.
Examples
Example 1 — a single parameter step
Input: params=[[1.0, 2.0, 3.0]], grads=[[0.1, 0.2, 0.3]], lr=0.5
Output: params -> [[0.95, 1.9, 2.85]]
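Explanation: each element moves by $-\mathrm{lr} \cdot g$: $1.0 - 0.5 \cdot 0.1 = 0.95$, $2.0 - 0.5 \cdot 0.2 = 1.9$, $3.0 - 0.5 \cdot 0.3 = 2.85$.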
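The arithmetic in Example 1 can be checked directly with NumPy. This is only a sketch of the update itself, not the full function: the name `before` is introduced here for illustration, and the comparison uses a tolerance because the results are floating-point values.

```python
import numpy as np

# Values taken from Example 1.
params = [np.array([1.0, 2.0, 3.0])]
grads = [np.array([0.1, 0.2, 0.3])]
lr = 0.5

before = params[0]      # second reference to the same array object

for p, g in zip(params, grads):
    p -= lr * g         # in-place update: writes into the existing buffer

# The original reference sees the updated values.
assert before is params[0]
assert np.allclose(before, [0.95, 1.9, 2.85])
```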
Example 2 — mismatched lengths raise
Input: params=[p1, p2] (2 tensors), grads=[g] (1 tensor), lr=0.1
Output: raises ValueError
Explanation: there are two parameters but only one gradient. Without an explicit length check, zip would silently truncate and p2 would never update — a nasty real-world bug — so the function raises instead.
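The silent truncation is easy to reproduce. The snippet below is an illustrative sketch with made-up arrays (`p1`, `p2`, `g` and the counter `visited` are not part of the problem); it shows a bare zip loop updating the first parameter and never touching the second.

```python
import numpy as np

p1 = np.ones(3)
p2 = np.ones(2)
g = np.full(3, 0.5)

# zip stops at the shorter input, so the second parameter is never visited.
visited = 0
for p, grad in zip([p1, p2], [g]):
    p -= 0.1 * grad
    visited += 1

print(visited)  # 1
print(p2)       # [1. 1.]  (never updated, and no error was raised)
```

On Python 3.10 and later, zip(..., strict=True) raises ValueError for unequal lengths, but only once the shorter input runs out, so earlier parameters would already have been mutated. An explicit len check before the loop avoids that partial update.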
Constraints
- Update in place with p -= lr * g (or np.subtract(p, lr*g, out=p)). Writing p = p - lr * g rebinds the local name and leaves the caller's array untouched.
- Return None.
- Raise ValueError when len(params) != len(grads), and when any params[i].shape != grads[i].shape.
- lr = 0 must leave every parameter unchanged.
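One way to satisfy these constraints is sketched below. It is a possible solution rather than the reference one. It assumes floating-point arrays (an integer array would reject the in-place float update). Checking all shapes before mutating anything, so that a failed call leaves every parameter untouched, is a design choice made here and not a stated requirement; the error messages are likewise illustrative.

```python
import numpy as np

def sgd_step(params, grads, lr):
    """Apply one plain SGD step to every parameter, in place. Returns None."""
    # Validate everything first so a failure leaves params unmodified.
    if len(params) != len(grads):
        raise ValueError(f"got {len(params)} params but {len(grads)} grads")
    for i, (p, g) in enumerate(zip(params, grads)):
        if p.shape != g.shape:
            raise ValueError(f"shape mismatch at index {i}: {p.shape} vs {g.shape}")

    for p, g in zip(params, grads):
        p -= lr * g  # in place; np.subtract(p, lr * g, out=p) is equivalent

# Usage with two parameters of different shapes.
w = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([0.5])
result = sgd_step([w, b], [np.ones((2, 2)), np.array([1.0])], lr=0.1)
assert result is None
assert np.allclose(w, [[0.9, 1.9], [2.9, 3.9]])
assert np.allclose(b, [0.4])
```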
Notes
- The rebind trap. p = p - lr*g creates a new array bound to the local p; the object the caller passed in still holds the old values. In-place -= mutates the underlying buffer, which is what the "reference unchanged" test verifies.
- Series. This is the base optimiser; later problems add momentum (SGD with momentum) and adaptive learning rates (Adam) on top of this same update.
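The rebind trap can be seen side by side in a few lines. The two helper functions below, `step_rebind` and `step_in_place`, are hypothetical names used only for this comparison.

```python
import numpy as np

def step_rebind(p, g, lr):
    p = p - lr * g   # builds a new array; only the local name p changes

def step_in_place(p, g, lr):
    p -= lr * g      # writes into the buffer the caller passed in

w = np.array([1.0, 2.0])
g = np.array([1.0, 1.0])

step_rebind(w, g, 0.5)
print(w)  # [1. 2.]  (the caller's array is unchanged)

step_in_place(w, g, 0.5)
print(w)  # [0.5 1.5]
```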
This problem ships 6 hidden tests. They run in your browser via Pyodide, with no backend and no submission queue.
- Updates a single parameter in place
- Updates multiple parameters of mixed shapes in place
- lr=0 leaves parameters unchanged
- Length mismatch between params and grads raises ValueError
- Shape mismatch within a pair raises ValueError
- Mutation happens IN PLACE: caller's reference sees new values