Mutual informationMedium

Mutual information

Background

Mutual information I(X;Y)I(X;Y) quantifies how much knowing one variable reduces uncertainty about another — the information they share, in bits. It is zero exactly when XX and YY are independent, and equals H(X)H(X) when YY fully determines XX. It is widely used for feature selection and as a model-agnostic measure of dependence (it captures nonlinear relationships that correlation misses).

Problem statement

Implement mutual_information(x, y) from paired discrete samples:

I(X;Y)=a,bp(a,b)log2p(a,b)p(a)p(b)I(X;Y) = \sum_{a,b} p(a,b)\,\log_2\frac{p(a,b)}{p(a)\,p(b)}

Estimate p(a,b)p(a,b), p(a)p(a), p(b)p(b) from empirical frequencies; sum only over pairs with p(a,b)>0p(a,b)>0. Return bits.

Input

  • x, y — 1-D arrays of the same length with discrete (hashable) values.

Output

A float: the mutual information in bits (0\ge 0).

Examples

Example 1

Input:  x = [0, 0, 1, 1], y = [0, 1, 0, 1]
Output: 0.0

Explanation: every (x, y) pair occurs once, so p(a,b)=0.25=p(a)p(b)p(a,b)=0.25=p(a)p(b) for all pairs — x and y are independent and share no information.

Constraints

  • Use empirical frequencies for the joint and both marginals.
  • Sum over pairs with positive joint probability only (skip p(a,b)=0p(a,b)=0).
  • Use log2\log_2 for an answer in bits.

Notes

  • When Y=XY = X, mutual information equals the entropy H(X)H(X) — they share everything.
  • MI is symmetric, I(X;Y)=I(Y;X)I(X;Y)=I(Y;X), and nonnegative; it generalizes correlation to any (including nonlinear) dependence.
Python
Loading...

This problem ships 5 hidden tests. They run in your browser via Pyodide — no backend, no submission queue. Press ▶ Run tests to execute.

  • Independent variables share ~0 information
  • Y = X gives MI equal to H(X)
  • Deterministic relation: MI equals H(Y)
  • Mutual information is non-negative
  • Symmetric in its arguments