Mutual information
Background
Mutual information quantifies how much knowing one variable reduces uncertainty about another — the information they share, in bits. It is zero exactly when and are independent, and equals when fully determines . It is widely used for feature selection and as a model-agnostic measure of dependence (it captures nonlinear relationships that correlation misses).
Problem statement
Implement mutual_information(x, y) from paired discrete samples:
Estimate , , from empirical frequencies; sum only over pairs with . Return bits.
Input
x,y— 1-D arrays of the same length with discrete (hashable) values.
Output
A float: the mutual information in bits ().
Examples
Example 1
Input: x = [0, 0, 1, 1], y = [0, 1, 0, 1]
Output: 0.0
Explanation: every (x, y) pair occurs once, so for all pairs — x and y are independent and share no information.
Constraints
- Use empirical frequencies for the joint and both marginals.
- Sum over pairs with positive joint probability only (skip ).
- Use for an answer in bits.
Notes
- When , mutual information equals the entropy — they share everything.
- MI is symmetric, , and nonnegative; it generalizes correlation to any (including nonlinear) dependence.
This problem ships 5 hidden tests. They run in your browser via Pyodide — no backend, no submission queue. Press ▶ Run tests to execute.
- •Independent variables share ~0 information
- •Y = X gives MI equal to H(X)
- •Deterministic relation: MI equals H(Y)
- •Mutual information is non-negative
- •Symmetric in its arguments