GMM E-step (responsibilities)Medium

GMM E-step (responsibilities)

Background

A Gaussian Mixture Model assumes the data was generated by KK Gaussian components mixed with weights πk\pi_k. EM alternates two steps to fit it; this is the E-step, which computes each point's responsibility γik\gamma_{ik} — the posterior probability that component kk generated point ii — given the current parameters. These soft assignments are GMM's generalisation of k-means' hard assignments.

Problem statement

Implement gmm_e_step(X, weights, means, covariances) that returns the responsibility matrix:

γik=πkN(xiμk,Σk)j=1KπjN(xiμj,Σj)\gamma_{ik} = \frac{\pi_k\, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}

using the multivariate normal density:

N(xμ,Σ)=1(2π)DΣexp ⁣(12(xμ)Σ1(xμ))\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^{D}\, \lvert\Sigma\rvert}}\, \exp\!\Big(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\Big)

Input

  • Xnp.ndarray of shape (N, D): the data points.
  • weightsnp.ndarray of shape (K,): mixing weights πk\pi_k (sum to 1).
  • meansnp.ndarray of shape (K, D): the component means.
  • covariancesnp.ndarray of shape (K, D, D): the component covariance matrices.

Output

Returns an np.ndarray of shape (N, K): each row is a probability distribution over the KK components (its entries sum to 1).

Examples

Example 1

Input:  X = [[0,0],[5,5]], weights = [0.5,0.5],
        means = [[0,0],[5,5]], covariances = [I, I]
Output: ~[[1, 0], [0, 1]]

Explanation: the first point sits on component 0's mean and is far from component 1, so almost all its responsibility goes to component 0; the second point is the mirror image.

Constraints

  • Use the full multivariate-normal density with each component's own covariance.
  • Each output row must sum to 11 — normalise the weighted densities across components.
  • Shapes: X is (N, D); the output is (N, K).
  • Tests compare with atol=1e-6.

Notes

  • The E-step yields soft assignments; the M-step (not asked here) re-estimates π,μ,Σ\pi, \mu, \Sigma from these responsibilities, and alternating the two is the EM algorithm.
  • As the covariances shrink toward zero the responsibilities harden to 0/1 and GMM collapses to k-means.
Python
Loading...

This problem ships 4 hidden tests. They run in your browser via Pyodide — no backend, no submission queue. Press ▶ Run tests to execute.

  • Responsibilities form a valid distribution per point
  • A point at a component mean is assigned almost entirely to it
  • Equidistant point with equal components splits 50/50
  • Higher mixing weight shifts responsibility toward that component