Group normalizationMedium

Group normalization

Background

Group Normalization splits a layer's channels into groups and normalizes each group's activations per sample, then applies a per-channel affine. Unlike batch norm it doesn't depend on the batch dimension, so it works with tiny batches (detection, segmentation) where batch-norm statistics are unreliable.

Problem statement

Implement group_normalization(X, gamma, beta, num_groups, epsilon=1e-5) for X of shape (B, C, H, W). Split the CC channels into num_groups groups; for each (sample, group) normalize over the group's channels and spatial dimensions, then apply the per-channel scale/shift:

X^=Xμgσg2+ϵ,Y=γX^+β\hat{X} = \frac{X - \mu_g}{\sqrt{\sigma_g^2 + \epsilon}}, \qquad Y = \gamma\,\hat{X} + \beta

where μg,σg2\mu_g, \sigma_g^2 are the mean/variance over each group's (group_size×H×W)(\text{group\_size}\times H\times W) elements.

Input

  • Xnp.ndarray (B, C, H, W).
  • gamma, betanp.ndarray (C,): per-channel scale and shift.
  • num_groupsint: must divide C.
  • epsilonfloat.

Output

Returns an np.ndarray (B, C, H, W).

Examples

Example 1

Input:  X = [[[[1, 2]], [[3, 4]]]]  (shape 1x2x1x2), gamma = [1,1], beta = [0,0], num_groups = 1
Output: [[[[-1.3416, -0.4472]], [[0.4472, 1.3416]]]]

Explanation: with one group, all four values {1,2,3,4}\{1,2,3,4\} are normalized together (mean 2.5, std 1.118\approx1.118), then the identity affine is applied.

Constraints

  • Reshape to (B, num_groups, group_size, H, W) and reduce mean/var over axes (2, 3, 4).
  • Apply per-channel gamma/beta (broadcast as (1, C, 1, 1)) after reshaping back.
  • num_groups divides C; tests compare with atol=1e-4.

Notes

  • num_groups = 1 is Layer Norm (all channels together); num_groups = C is Instance Norm (each channel alone) — GroupNorm interpolates between them.
  • Statistics are computed per sample, so GroupNorm behaves identically at train and test time, unlike BatchNorm.
Python
Loading...

This problem ships 4 hidden tests. They run in your browser via Pyodide — no backend, no submission queue. Press ▶ Run tests to execute.

  • Reference example
  • Each group is normalized to ~0 mean and ~1 var
  • Output shape matches input
  • Affine: gamma scales and beta shifts