AdaBoost fit (decision stumps)Medium

AdaBoost fit (decision stumps)

Background

AdaBoost turns a crowd of weak learners — here, one-level decision trees ("stumps") — into a strong classifier. It trains stumps in sequence; after each one it re-weights the training examples so the next stump focuses on those still misclassified, and assigns each stump a vote α\alpha proportional to its accuracy. The final prediction is the sign of the weighted vote.

Problem statement

Implement adaboost_fit(X, y, n_clf) that fits n_clf decision stumps by AdaBoost and returns them. Start from uniform sample weights wi=1/nw_i = 1/n. Each round, pick the stump (feature, threshold, polarity) with the smallest weighted error ε\varepsilon, then update:

α=12ln ⁣1εε+1010\alpha = \tfrac{1}{2}\ln\!\frac{1 - \varepsilon}{\varepsilon + 10^{-10}} wiwieαyiy^i,wwjwjw_i \leftarrow w_i\, e^{-\alpha\, y_i\, \hat{y}_i}, \qquad w \leftarrow \frac{w}{\sum_j w_j}

A stump predicts 1-1 where xf<thresholdx_f < \text{threshold} and +1+1 otherwise (flipped when polarity =1=-1). If a stump's weighted error exceeds 0.50.5, set polarity =1=-1 and use 1ε1 - \varepsilon.

Input

  • Xnp.ndarray of shape (n_samples, n_features): numeric features.
  • ynp.ndarray of shape (n_samples,): labels in {1,+1}\{-1, +1\}.
  • n_clfint: the number of stumps (boosting rounds).

Output

Returns a list of n_clf dicts, each a stump with keys feature_index (int), threshold (the split value), polarity (±1\pm 1), and alpha (float, the stump's vote weight).

Examples

Example 1

Input:  X = [[1, 2], [2, 3], [3, 4], [4, 5]], y = [1, 1, -1, -1], n_clf = 3
Output: clfs[0] = {'feature_index': 0, 'threshold': 3, 'polarity': -1, 'alpha': 11.5129}

Explanation: the stump "predict -1 when feature 0 >= 3" (polarity -1, threshold 3) separates the two classes with zero weighted error, so its vote α\alpha is very large and all three rounds re-select the same perfect stump.

Constraints

  • Initialise sample weights uniformly to 1/n1/n.
  • Candidate thresholds are the unique values of each feature; a sample predicts 1-1 when xf<thresholdx_f < \text{threshold}.
  • If the weighted error >0.5> 0.5, set polarity =1=-1 and replace ε\varepsilon with 1ε1 - \varepsilon.
  • Apply the α\alpha and weight-update formulas above, then renormalise the weights each round.
  • Labels are {1,+1}\{-1, +1\} (not {0,1}\{0, 1\}).

Notes

  • The +1010+10^{-10} inside α\alpha guards against division by zero when a stump is perfect (ε=0\varepsilon = 0); α\alpha then blows up and that stump dominates the vote.
  • Re-weighting is the heart of boosting: misclassified points become heavier, forcing later stumps to attack the hard cases instead of re-solving the easy ones.
Python
Loading...

This problem ships 4 hidden tests. They run in your browser via Pyodide — no backend, no submission queue. Press ▶ Run tests to execute.

  • Returns n_clf stumps, each with the expected keys and valid polarity
  • First stump separates the toy data on feature 0 at threshold 3
  • Boosted ensemble classifies separable training data perfectly
  • A perfect weak learner gets a large positive alpha