Multi-class cross-entropy loss (Easy)

Background

Multi-class cross-entropy is the standard loss for softmax classifiers: it penalises the model by the negative log-probability it assigned to the true class. Minimising it is equivalent to maximising the likelihood of the correct labels, which makes it the usual training objective for neural-network classifiers that choose among more than two mutually exclusive classes.

Problem statement

Implement compute_cross_entropy_loss(predicted_probs, true_labels, epsilon=1e-15):

$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\,\log \hat{p}_{ic}$$

where $\hat{p}$ are the predicted class probabilities, $y$ are the one-hot labels, $N$ is the number of samples, $C$ is the number of classes, and $\log$ is the natural logarithm. Clip $\hat{p}$ to $[\epsilon,\, 1-\epsilon]$ before taking the log.
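One way the formula could be turned into code is sketched below. It is a minimal NumPy version written directly from the statement above, not the page's reference solution; the intermediate names `probs`, `labels`, and `per_sample` are illustrative choices.

```python
import numpy as np


def compute_cross_entropy_loss(predicted_probs, true_labels, epsilon=1e-15):
    """Mean multi-class cross-entropy for one-hot labels (illustrative sketch)."""
    # Accept nested lists as well as arrays, and work in float64.
    probs = np.asarray(predicted_probs, dtype=np.float64)
    labels = np.asarray(true_labels, dtype=np.float64)

    # Clip to [epsilon, 1 - epsilon] so the log never sees exactly 0.
    probs = np.clip(probs, epsilon, 1.0 - epsilon)

    # Sum y * log(p) over classes for each sample, then average over samples.
    per_sample = -np.sum(labels * np.log(probs), axis=1)
    return float(np.mean(per_sample))


loss = compute_cross_entropy_loss(
    [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1]],
    [[1, 0, 0], [0, 1, 0]],
)
print(round(loss, 4))  # 0.4338
```

Wrapping the result in `float(...)` returns a plain Python float rather than a NumPy scalar, which matches the stated return type.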

Input

  • predicted_probs: np.ndarray of shape (N, C), predicted probabilities per class (rows sum to ~1).
  • true_labels: np.ndarray of shape (N, C), one-hot true labels.
  • epsilon: float, clipping constant.

Output

Returns a float: the mean cross-entropy over the $N$ samples.

Examples

Example 1

Input:  predicted_probs = [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1]], true_labels = [[1,0,0],[0,1,0]]
Output: 0.4338

Explanation: the model gave the true classes probabilities 0.7 and 0.6; the loss is $-\tfrac{1}{2}(\log 0.7 + \log 0.6) \approx 0.4338$.
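Spelled out numerically, with natural logarithms rounded to four decimals, the arithmetic runs as follows. Only the true-class probabilities appear because every other term is multiplied by a zero label.

```latex
\begin{aligned}
\log 0.7 &\approx -0.3567, \qquad \log 0.6 \approx -0.5108,\\
L &= -\tfrac{1}{2}\bigl(-0.3567 - 0.5108\bigr) = \tfrac{1}{2}(0.8675) \approx 0.4338.
\end{aligned}
```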

Constraints

  • Clip predicted probabilities to $[\epsilon,\, 1-\epsilon]$ before the log (avoids $\log 0$).
  • Sum over classes per sample, then average over samples.
  • Return a float; tests compare with atol=1e-4.
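The clipping constraint matters at both extremes, which a small experiment can illustrate. In the sketch below, `mean_ce` and the two hand-made inputs `wrong` and `right` are hypothetical names introduced only for this demonstration. Without clipping, a true class with probability 0 gives an infinite loss, and even a perfect prediction gives `nan`, because the zero-label terms evaluate `0 * log(0)`.

```python
import math

import numpy as np

epsilon = 1e-15  # default clipping constant from the problem statement

labels = np.array([[1.0, 0.0]])
wrong = np.array([[0.0, 1.0]])  # true class receives probability 0
right = np.array([[1.0, 0.0]])  # true class receives probability 1


def mean_ce(probs, labels, clip):
    # Mean cross-entropy with optional clipping (illustrative helper).
    if clip:
        probs = np.clip(probs, epsilon, 1.0 - epsilon)
    # Silence the warnings that log(0) and 0 * inf would otherwise raise.
    with np.errstate(divide="ignore", invalid="ignore"):
        return float(-np.mean(np.sum(labels * np.log(probs), axis=1)))


inf_loss = mean_ce(wrong, labels, clip=False)  # -log(0) -> inf
nan_loss = mean_ce(right, labels, clip=False)  # 0 * log(0) -> nan
capped = mean_ce(wrong, labels, clip=True)     # capped at -log(epsilon)
tiny = mean_ce(right, labels, clip=True)       # roughly 1e-15

print(inf_loss, nan_loss)  # inf nan
print(round(capped, 2))    # 34.54
```

With clipping, the worst possible per-sample loss is $-\log \epsilon$, about 34.54 for the default $\epsilon = 10^{-15}$.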

Notes

  • Because labels are one-hot, the inner sum collapses to a single term: $-\log \hat{p}_{i, y_i}$, the negative log-prob of the correct class (here $y_i$ denotes the index of sample $i$'s true class).
  • Paired with the softmax that produced $\hat{p}$, the gradient of each sample's loss with respect to its logits simplifies to $\hat{p} - y$; for the mean loss $L$ this is scaled by $1/N$.
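Both notes can be checked numerically. The sketch below uses arbitrary random logits and a hypothetical `softmax` helper (neither is part of the problem). It first confirms that the double sum equals indexing the true-class probability, then compares $\hat{p} - y$ against a central finite-difference gradient of the per-sample loss.

```python
import numpy as np


def softmax(logits):
    # Subtract the row maximum for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def per_sample_loss(logits, labels):
    # Cross-entropy of softmax(logits), one value per sample.
    return -np.sum(labels * np.log(softmax(logits)), axis=1)


rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 3))    # arbitrary illustrative logits
class_idx = np.array([0, 2, 1, 2])  # index of the true class per sample
labels = np.eye(3)[class_idx]       # one-hot encoding of class_idx

# 1) With one-hot labels, the sum over classes picks out the true-class term.
probs = softmax(logits)
picked = -np.log(probs[np.arange(4), class_idx])
assert np.allclose(per_sample_loss(logits, labels), picked)

# 2) The gradient of each sample's loss w.r.t. its own logits is p_hat - y.
analytic = probs - labels
numeric = np.zeros_like(logits)
h = 1e-6
for i in range(logits.shape[0]):
    for c in range(logits.shape[1]):
        up = logits.copy()
        down = logits.copy()
        up[i, c] += h
        down[i, c] -= h
        numeric[i, c] = (
            per_sample_loss(up, labels)[i] - per_sample_loss(down, labels)[i]
        ) / (2 * h)
assert np.allclose(analytic, numeric, atol=1e-6)
```

Each row of `analytic` sums to zero, since both $\hat{p}$ and $y$ sum to one over the classes.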

This problem ships 4 hidden tests, which run in the browser via Pyodide (no backend, no submission queue):

  • Reference example: 0.4338
  • Confident correct predictions -> near 0
  • Uniform predictions -> -log(1/C) = log(C)
  • Averages over samples
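The hidden tests themselves are not shown, but the properties they name can be approximated locally. The checks below are a guess at their spirit, using a compact stand-in `ce` that is assumed to behave like `compute_cross_entropy_loss`. The inputs are made up for illustration and are not the actual test data.

```python
import numpy as np


def ce(probs, labels, epsilon=1e-15):
    # Compact stand-in for compute_cross_entropy_loss, assumed to follow the spec.
    probs = np.clip(np.asarray(probs, dtype=float), epsilon, 1.0 - epsilon)
    return float(-np.mean(np.sum(np.asarray(labels) * np.log(probs), axis=1)))


C = 4
labels = np.eye(C)  # four samples, one per class

# Confident correct predictions -> near 0
assert ce(np.eye(C), labels) < 1e-9

# Uniform predictions -> -log(1/C) = log(C)
uniform = np.full((C, C), 1.0 / C)
assert np.isclose(ce(uniform, labels), np.log(C))

# Averages over samples: the batch loss is the mean of the single-sample losses
probs = np.array([[0.7, 0.2, 0.1], [0.3, 0.6, 0.1]])
y = np.array([[1, 0, 0], [0, 1, 0]])
singles = [ce(probs[i:i + 1], y[i:i + 1]) for i in range(2)]
assert np.isclose(ce(probs, y), np.mean(singles))
```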