ROC-AUC from scratch
Background
The ROC curve plots the true-positive rate against the false-positive rate as the decision threshold sweeps from high to low; the AUC (area under it) is a single, threshold-independent score for a binary classifier's ranking quality. An AUC of 1 is perfect ranking, 0.5 is random, and the AUC equals the probability that a randomly chosen positive is scored above a randomly chosen negative.
Problem statement
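Implement roc_auc(scores, labels) returning the area under the ROC curve via the equivalent rank/pairwise definition (the Mann-Whitney U statistic): over all positive-negative pairs, count how often the positive outscores the negative, with ties counting a half:
$$\mathrm{AUC} = \frac{1}{|P|\,|N|} \sum_{p \in P} \sum_{n \in N} \left( \mathbf{1}[s_p > s_n] + \tfrac{1}{2}\,\mathbf{1}[s_p = s_n] \right)$$
Here P and N are the sets of positive and negative examples, and s_i is the score of example i.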
Input
- scores: array-like of float, the predicted scores/probabilities (higher = more positive).
- labels: array-like of {0, 1}, the true binary labels.
Output
Returns a float in [0, 1].
Examples
Example 1
Input: scores = [0.1, 0.4, 0.35, 0.8], labels = [0, 0, 1, 1]
Output: 0.75
Explanation: positives score {0.35, 0.8} and negatives score {0.1, 0.4}. Of the 4 positive-negative pairs, 3 have the positive ranked above the negative (only the pair 0.35 vs. 0.4 is out of order), so AUC = 3/4 = 0.75.
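The pairwise count in the explanation above can be sketched directly as a double loop over positives and negatives. This is a minimal sketch rather than the reference solution: it runs in O(|P|·|N|) time, and raising ValueError on mismatched lengths or a missing class is an assumption, since the problem only states that both classes are required.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise (Mann-Whitney) AUC: share of positive-negative pairs ordered correctly."""
    if len(scores) != len(labels):
        # Assumption: the problem does not specify error handling.
        raise ValueError("scores and labels must have the same length")

    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above the negative
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Example 1: three of the four pairs are ordered correctly.
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The quadratic loop is fine for small inputs; for large arrays a sort-based rank formulation gives the same value in O(n log n).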
Constraints
- AUC is the fraction of positive-negative pairs ordered correctly, with ties counting 1/2.
- Requires at least one positive and one negative; the result lies in [0, 1].
- Equivalent to integrating the ROC curve, but the pairwise form avoids threshold bookkeeping.
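To illustrate the equivalence stated in the last constraint, the following sketch integrates the ROC curve with the trapezoidal rule. The function name roc_auc_trapezoid is hypothetical and not part of the problem. Tied scores are consumed as one group so that they produce a single diagonal segment, which is what makes the area agree with the "ties count a half" convention.

```python
def roc_auc_trapezoid(scores, labels):
    """Area under the ROC curve via the trapezoidal rule (hypothetical helper)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")

    # Sweep the threshold from high to low by visiting scores in descending order.
    order = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)

    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    i = 0
    while i < len(order):
        # Consume every item sharing this score: a tie yields one ROC segment.
        j = i
        while j < len(order) and order[j][0] == order[i][0]:
            if order[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        # Trapezoid between the previous ROC point and the new one.
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return area


print(roc_auc_trapezoid([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

This is the threshold bookkeeping that the pairwise form avoids: the running counts, the tie grouping, and the previous curve point all have to be tracked explicitly.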
Notes
- AUC depends only on the ordering of scores, not their calibration — any monotonic rescaling leaves it unchanged.
- A perfect classifier scores every positive above every negative (AUC 1); reversing the scores gives 0; random scores give 0.5 in expectation.
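The notes above can be checked numerically with a rank-based sketch. It assumes the standard Mann-Whitney identity U = R_pos - n_pos(n_pos + 1)/2, where R_pos is the sum of the positives' average ranks, and AUC = U / (n_pos · n_neg). The name roc_auc_ranks is hypothetical. Because only ranks enter the computation, a strictly increasing transform of the scores cannot change the result, and negating the scores turns an AUC of a into 1 - a.

```python
import math


def roc_auc_ranks(scores, labels):
    """AUC from average ranks (hypothetical helper), O(n log n)."""
    n = len(scores)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")

    # Assign 1-based ranks in ascending score order, averaging over ties.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]

rescaled = [math.exp(10 * s) for s in scores]  # strictly increasing transform
reversed_scores = [-s for s in scores]         # flips the ordering

print(roc_auc_ranks(scores, labels))           # 0.75
print(roc_auc_ranks(rescaled, labels))         # 0.75
print(roc_auc_ranks(reversed_scores, labels))  # 0.25
```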
This problem ships 4 hidden tests, which run in the browser via Pyodide (no backend, no submission queue):
- Reference example: AUC = 0.75
- Perfect ranking gives AUC 1.0
- Reversed ranking gives AUC 0.0
- All-tie scores give AUC 0.5