Decision tree learning (ID3)
Background
A decision tree classifies an example by asking a sequence of questions about its attributes. ID3 builds one greedily: at each node it picks the attribute with the highest information gain — the reduction in label entropy achieved by splitting on that attribute — then recurses on each branch. The result is a human-readable tree, the canonical interpretable model and the foundation of random forests and gradient boosting.
Problem statement
Implement learn_decision_tree(examples, attributes, target_attr) that builds an ID3 decision tree from categorical data. Using base-2 entropy of a label set and the information gain of attribute :
At each node: if all examples share a label, return it (a leaf); if no attributes remain, return the majority label; otherwise split on the attribute of maximum information gain and recurse on each of its values.
Input
examples—list[dict]: each dict maps attribute names to categorical values and includes the target attribute.attributes—list[str]: the attribute names available to split on (excludes the target).target_attr—str: the key holding the class label.
Output
Returns the tree as either a leaf (the class label, e.g. 'A') or an internal node — a nested dict {attribute: {value: subtree, ...}}. Returns the string 'No examples' when examples is empty.
Examples
Example 1
Input: examples = [
{'Color': 'Red', 'Shape': 'Round', 'Label': 'A'},
{'Color': 'Red', 'Shape': 'Square', 'Label': 'A'},
{'Color': 'Blue', 'Shape': 'Round', 'Label': 'B'},
{'Color': 'Blue', 'Shape': 'Square', 'Label': 'B'},
]
attributes = ['Color', 'Shape'], target_attr = 'Label'
Output: {'Color': {'Red': 'A', 'Blue': 'B'}}
Explanation: 'Color' has information gain — it perfectly separates A from B — while 'Shape' has gain , so 'Color' becomes the root. Each colour's examples are pure, so both branches collapse to leaf labels.
Constraints
- Use base-2 entropy; information gain is parent entropy minus the size-weighted average child entropy.
- Stopping rules, in order: empty examples →
'No examples'; all labels equal → that label; no attributes left → the majority label. - Branch on every value of the chosen attribute present in the current subset, and remove that attribute before recursing.
- Information-gain ties may be broken arbitrarily (the first max-gain attribute).
Notes
- ID3 handles categorical attributes and creates one branch per value — contrast with CART's numeric
<= thresholdbinary splits. - Each split consumes an attribute, so depth is bounded by the number of attributes; with no pruning, ID3 can overfit small or noisy data.
This problem ships 5 hidden tests. They run in your browser via Pyodide — no backend, no submission queue. Press ▶ Run tests to execute.
- •A perfectly predictive attribute becomes the root with pure leaves
- •All examples sharing a label return that label as a leaf
- •With no attributes left, return the majority class
- •Empty examples return the 'No examples' sentinel
- •Nested split: root is the most informative attribute