Learning-rate range test (Medium)

Background

The learning-rate range test (due to Leslie Smith, popularized by fast.ai) finds a good learning rate cheaply: run a few hundred iterations while increasing the LR geometrically from tiny to large, recording the loss at each LR. Plotting loss against $\log_{10}(\text{lr})$ gives a curve that descends, then explodes. A common heuristic picks the LR at the point of steepest descent, where the loss is dropping fastest, which sits safely below the LR that minimizes the loss.
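As a minimal sketch of the geometric sweep described above, the schedule multiplies the LR by a constant factor at every iteration. The helper name `geometric_lrs` and the endpoint values are illustrative assumptions, not part of the problem.

```python
import numpy as np

def geometric_lrs(lr_min, lr_max, num_iters):
    """Hypothetical helper: LRs spaced evenly on a log scale (needs num_iters >= 2)."""
    steps = np.arange(num_iters)
    # lr_i = lr_min * (lr_max / lr_min) ** (i / (num_iters - 1))
    return lr_min * (lr_max / lr_min) ** (steps / (num_iters - 1))

# Assumed endpoints, chosen only for illustration.
lrs = geometric_lrs(1e-5, 1.0, 6)

# Consecutive LRs differ by a constant ratio, so log10(lrs) is evenly spaced.
ratios = lrs[1:] / lrs[:-1]
```

NumPy's `np.geomspace(lr_min, lr_max, num=num_iters)` produces the same grid and is usually the more convenient choice.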

Problem statement

Implement suggest_lr(lrs, losses) that returns the learning rate at the point of steepest descent of the loss curve:

$$g_i = \frac{d\,\text{loss}}{d\,\log_{10}(\text{lr})}\Big|_i, \qquad \text{suggested lr} = \text{lrs}[\arg\min_i g_i]$$

Use np.gradient(losses, np.log10(lrs)) to estimate the slope, then return the LR where that slope is most negative.
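One possible implementation of this recipe is sketched below. It assumes the inputs may arrive as plain lists and converts them to arrays first; that conversion is a convenience added here, not a stated requirement.

```python
import numpy as np

def suggest_lr(lrs, losses):
    """Return the LR at which d(loss)/d(log10 lr) is most negative."""
    lrs = np.asarray(lrs, dtype=float)        # assumption: accept lists as well as arrays
    losses = np.asarray(losses, dtype=float)
    # Slope of the loss with respect to log10(lr), estimated at every sample.
    slopes = np.gradient(losses, np.log10(lrs))
    # argmin selects the most negative slope, i.e. the steepest descent.
    return float(lrs[np.argmin(slopes)])

result = suggest_lr([1e-4, 1e-3, 1e-2, 1e-1, 1.0], [2.5, 2.3, 1.8, 1.0, 3.0])
```

Indexing back into `lrs` ensures the returned value is one of the inputs rather than an interpolated LR.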

Input

  • lrs: np.ndarray of learning rates, strictly increasing (typically geometric).
  • losses: np.ndarray of recorded losses, same length as lrs.

Output

A float: the learning rate at the steepest-descent point.

Examples

Example 1

Input:  lrs = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]
        losses = [2.5, 2.3, 1.8, 1.0, 3.0]
Output: 0.01

Explanation: taking the gradient of loss with respect to $\log_{10}(\text{lr})$ gives slopes $[-0.2, -0.35, -0.65, 0.6, 2.0]$. The most negative slope is at index 2, so the suggested LR is $10^{-2} = 0.01$.
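The slopes in this example can be reproduced by hand. The sketch below uses one-sided differences at the two ends and central differences in the interior, which is what np.gradient does with its default settings when the spacing is uniform, as it is here on the log scale. The variable names are illustrative.

```python
import numpy as np

lrs = np.array([1e-4, 1e-3, 1e-2, 1e-1, 1.0])
losses = np.array([2.5, 2.3, 1.8, 1.0, 3.0])
x = np.log10(lrs)  # approximately [-4, -3, -2, -1, 0], evenly spaced

manual = np.empty_like(losses)
# One-sided differences at the two endpoints.
manual[0] = (losses[1] - losses[0]) / (x[1] - x[0])
manual[-1] = (losses[-1] - losses[-2]) / (x[-1] - x[-2])
# Central differences in the interior (this simple form assumes uniform spacing).
manual[1:-1] = (losses[2:] - losses[:-2]) / (x[2:] - x[:-2])

reference = np.gradient(losses, x)
steepest_index = int(np.argmin(manual))
```

For instance, the slope at index 2 is (1.0 - 2.3) / 2 = -0.65, the most negative entry.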

Constraints

  • Differentiate loss with respect to $\log_{10}(\text{lr})$, not raw lr (the test sweeps LR on a log scale).
  • The suggested LR is the one at the minimum (most negative) gradient.
  • Return one of the values in lrs as a float.
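To see why the log-scale constraint matters, the following sketch compares the two derivatives on the data from Example 1. It is an illustration only; the variable names are not part of the problem.

```python
import numpy as np

lrs = np.array([1e-4, 1e-3, 1e-2, 1e-1, 1.0])
losses = np.array([2.5, 2.3, 1.8, 1.0, 3.0])

# Required: slope with respect to log10(lr).
slopes_log = np.gradient(losses, np.log10(lrs))
# For contrast only: slope with respect to the raw LR.
slopes_raw = np.gradient(losses, lrs)

lr_log = float(lrs[np.argmin(slopes_log)])
lr_raw = float(lrs[np.argmin(slopes_raw)])
```

On this data the log-scale derivative selects 0.01, while the raw-LR derivative selects 1e-4: dividing by the tiny LR gaps at the low end of the sweep inflates those slopes, so the raw version is biased toward the smallest learning rates.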

Notes

  • Picking the steepest-descent LR (rather than the loss-minimizing LR) leaves headroom before the curve blows up — the loss-min LR is often already too large to train stably.
  • Real implementations smooth the loss (EMA) before differentiating to reduce noise; here the raw losses are used.
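The smoothing mentioned in the last note could look roughly like the sketch below. The function name `ema_smooth`, the default `beta=0.9`, and the bias correction are assumptions made for illustration; the problem itself uses the raw losses.

```python
import numpy as np

def ema_smooth(losses, beta=0.9):
    """Hypothetical helper: bias-corrected exponential moving average of the losses."""
    losses = np.asarray(losses, dtype=float)
    smoothed = np.empty_like(losses)
    avg = 0.0
    for i, loss in enumerate(losses):
        avg = beta * avg + (1.0 - beta) * loss
        # Bias correction compensates for starting the average at zero.
        smoothed[i] = avg / (1.0 - beta ** (i + 1))
    return smoothed

noisy = np.array([2.5, 2.7, 2.2, 2.4, 1.9, 2.1, 1.5])
smooth = ema_smooth(noisy)
```

The smoothed array can then be passed to np.gradient in place of the raw losses. A larger `beta` smooths more strongly but also delays the point at which the curve appears to turn.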

This problem ships 4 hidden tests. They run in your browser via Pyodide, with no backend and no submission queue.

  • Reference example
  • Returned LR is one of the inputs
  • Picks the steepest (most negative central-difference) slope
  • Differentiates on a log scale (geometric lrs)