Huber & Hinge losses

Difficulty: Easy

Background

Two standard losses, one for robust regression and one for margin-based classification. Huber loss blends squared error (for small residuals) with absolute error (for large ones), so a few outliers don't dominate the gradient; it is the standard robust regression loss. Hinge loss is the SVM loss: it penalises predictions that are wrong, and also predictions that are correct but not confident, enforcing a margin.

Problem statement

Implement two functions.

huber_loss(pred, target, delta=1.0) returns the mean over elements of the following, where $r = \text{pred} - \text{target}$:

$$\ell_\delta(r) = \begin{cases} \tfrac12 r^2 & |r| \le \delta \\[2pt] \delta\,(|r| - \tfrac12\delta) & |r| > \delta \end{cases}$$
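The piecewise definition above can be sketched with NumPy as follows. This is one possible implementation, not a reference solution: it assumes `delta > 0`, that `pred` and `target` broadcast against each other, and it converts list inputs to float arrays (a convenience that is an assumption here).

```python
import numpy as np

def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss. Sketch that assumes delta > 0."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    r = pred - target
    abs_r = np.abs(r)
    quadratic = 0.5 * r ** 2                    # used where |r| <= delta
    linear = delta * (abs_r - 0.5 * delta)      # used where |r| > delta
    per_element = np.where(abs_r <= delta, quadratic, linear)
    return float(np.mean(per_element))
```

Both branches are computed for every element and `np.where` picks one per element, which keeps the code vectorised at the cost of a little redundant arithmetic.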

hinge_loss(pred, target) returns the mean of $\max(0,\, 1 - y_i\,\hat{y}_i)$ with labels $y \in \{-1, +1\}$:

$$\mathcal{L}_{\text{hinge}} = \frac{1}{N}\sum_i \max\big(0,\; 1 - y_i\,\hat{y}_i\big)$$
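A minimal NumPy sketch of the hinge formula above, under the assumption that `target` already holds labels in {-1, +1} (the function does not validate this) and that list inputs should be accepted:

```python
import numpy as np

def hinge_loss(pred, target):
    """Mean hinge loss. Sketch that assumes labels are -1 or +1."""
    pred = np.asarray(pred, dtype=float)      # raw scores y_hat
    target = np.asarray(target, dtype=float)  # labels y
    margins = target * pred                   # y_i * y_hat_i
    per_element = np.maximum(0.0, 1.0 - margins)
    return float(np.mean(per_element))
```

`np.maximum` is the element-wise maximum; `np.max` would instead reduce the array to a single value.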

Input

  • pred (np.ndarray): predictions (raw scores for hinge).
  • target (np.ndarray): targets (real values for Huber; $\{-1, +1\}$ labels for hinge).
  • delta (float, Huber only): the transition point between the quadratic and linear regimes.

Output

Each function returns a float: the mean loss.

Examples

Example 1 — Huber

Input:  huber_loss([1, 2, 3], [1.5, 4, 3], delta=1.0)
Output: 0.5417

Explanation: residuals $[-0.5, -2, 0]$. The small one is quadratic, $\tfrac12(0.5)^2 = 0.125$; the large one is linear, $1\cdot(2 - 0.5) = 1.5$; the zero contributes 0. Mean $= 1.625/3 = 0.5417$.
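The arithmetic in this explanation can be traced step by step. The variable names below are illustrative only:

```python
import numpy as np

pred = np.array([1.0, 2.0, 3.0])
target = np.array([1.5, 4.0, 3.0])
delta = 1.0

r = pred - target                  # residuals: [-0.5, -2.0, 0.0]
is_small = np.abs(r) <= delta      # [True, False, True]
per_element = np.where(
    is_small,
    0.5 * r ** 2,                          # quadratic branch
    delta * (np.abs(r) - 0.5 * delta),     # linear branch
)                                  # [0.125, 1.5, 0.0]
mean_loss = float(per_element.mean())
print(round(mean_loss, 4))         # 0.5417
```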

Example 2 — Hinge

Input:  hinge_loss([0.8, -0.5, 2], [1, -1, 1])
Output: 0.2333

Explanation: margins $y\hat{y} = [0.8, 0.5, 2]$, so the losses $\max(0, 1 - y\hat{y})$ are $[0.2, 0.5, 0]$ and the mean is $0.7/3 = 0.2333$.
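The same numbers, traced in NumPy with illustrative variable names. Note that the second sample has a negative score and a negative label, so its margin is positive:

```python
import numpy as np

scores = np.array([0.8, -0.5, 2.0])
labels = np.array([1.0, -1.0, 1.0])

margins = labels * scores                      # [0.8, 0.5, 2.0]
per_element = np.maximum(0.0, 1.0 - margins)   # about [0.2, 0.5, 0.0]
mean_loss = float(per_element.mean())
print(round(mean_loss, 4))                     # 0.2333
```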

Constraints

  • Huber is quadratic when $|r|\le\delta$ and linear ($\delta(|r|-\tfrac12\delta)$) beyond; the two pieces meet smoothly at $\pm\delta$.
  • Hinge labels are $\{-1, +1\}$; points correctly classified beyond the margin ($y\hat{y}\ge 1$) contribute 0.
  • Both functions return the mean over elements.
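The claim that the two Huber pieces meet smoothly at ±δ can be checked directly: both the values and the slopes of the two branches should agree there. The helper names below are hypothetical and exist only for this check.

```python
def huber_pieces(r, delta):
    """Return (quadratic branch, linear branch) evaluated at residual r."""
    quadratic = 0.5 * r * r
    linear = delta * (abs(r) - 0.5 * delta)
    return quadratic, linear

def huber_piece_slopes(r, delta):
    """Return the derivatives of the two branches with respect to r."""
    sign = (r > 0) - (r < 0)       # +1, 0 or -1
    return r, delta * sign

delta = 1.5
for r in (delta, -delta):
    q, lin = huber_pieces(r, delta)
    dq, dlin = huber_piece_slopes(r, delta)
    assert abs(q - lin) < 1e-12    # values match at the joint
    assert abs(dq - dlin) < 1e-12  # slopes match at the joint
```

Because the slopes also agree, the gradient stays continuous across the transition. Its magnitude is capped at δ in the linear regime.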

Notes

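  • As $\delta\to\infty$ Huber becomes half the squared error (MSE/2); as $\delta\to 0$ it behaves like $\delta\,|r|$, i.e. MAE scaled by $\delta$. It interpolates between the two.
  • Hinge is non-smooth at the margin. For a linear model $\hat{y}_i = w^\top x_i$, its sub-gradient with respect to $w$ is $-y_i x_i$ when $y_i\hat{y}_i<1$ and $0$ otherwise, which is the hinge term in the SVM/Pegasos update.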
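Both notes can be illustrated numerically. The sketch below assumes a linear scorer `w @ x` for the sub-gradient. `huber_elementwise` and `hinge_subgradient` are hypothetical helper names, not part of the required interface.

```python
import numpy as np

def huber_elementwise(r, delta):
    """Per-element Huber loss for an array of residuals r."""
    r = np.asarray(r, dtype=float)
    abs_r = np.abs(r)
    return np.where(abs_r <= delta, 0.5 * r ** 2, delta * (abs_r - 0.5 * delta))

def hinge_subgradient(w, x, y):
    """Sub-gradient of max(0, 1 - y * (w @ x)) with respect to w, one sample."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    if y * np.dot(w, x) < 1.0:
        return -y * x              # margin violated: push w along y * x
    return np.zeros_like(w)        # margin satisfied: no contribution

r = np.array([-3.0, 0.2, 5.0])
# Very large delta: every residual is in the quadratic branch, giving r^2 / 2.
assert np.allclose(huber_elementwise(r, delta=1e6), 0.5 * r ** 2)
# Very small delta: the loss divided by delta approaches |r|.
small = 1e-6
assert np.allclose(huber_elementwise(r, delta=small) / small, np.abs(r), atol=1e-5)
```

A plain sub-gradient step would then be `w = w - lr * hinge_subgradient(w, x, y)` for some assumed learning rate `lr`. Pegasos additionally applies a regularisation term, which this sketch leaves out.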

This problem ships 4 hidden tests. They run in your browser via Pyodide, with no backend and no submission queue.

  • Huber reference: 0.5417
  • Huber equals MSE/2 for small residuals
  • Hinge reference: 0.2333
  • Hinge is 0 when all points are beyond the margin
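The hidden tests themselves are not shown, so the following is only a guess at what checks with these four names might look like. The compact implementations and the specific inputs for the second and fourth checks are assumptions.

```python
import numpy as np

def huber_loss(pred, target, delta=1.0):
    r = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    a = np.abs(r)
    return float(np.mean(np.where(a <= delta, 0.5 * r ** 2, delta * (a - 0.5 * delta))))

def hinge_loss(pred, target):
    m = np.asarray(target, dtype=float) * np.asarray(pred, dtype=float)
    return float(np.mean(np.maximum(0.0, 1.0 - m)))

# 1. Huber reference value from Example 1.
assert abs(huber_loss([1, 2, 3], [1.5, 4, 3], delta=1.0) - 0.5417) < 1e-4

# 2. With all |r| <= delta, Huber equals MSE / 2.
pred = np.array([0.1, -0.2, 0.3])
target = np.zeros(3)
mse = float(np.mean((pred - target) ** 2))
assert abs(huber_loss(pred, target, delta=1.0) - mse / 2) < 1e-12

# 3. Hinge reference value from Example 2.
assert abs(hinge_loss([0.8, -0.5, 2], [1, -1, 1]) - 0.2333) < 1e-4

# 4. All margins y * y_hat >= 1, so every term is zero.
assert hinge_loss([2.0, -3.0, 1.5], [1, -1, 1]) == 0.0
```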