Covariance matrix
Background
The covariance matrix summarises how a set of variables vary together: entry $C_{ij}$ is the covariance between feature $i$ and feature $j$, and the diagonal holds each feature's variance. It is the foundation of PCA, Gaussian models, whitening, and the Mahalanobis distance.
Problem statement
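Implement calculate_covariance_matrix(vectors) where vectors[i] is the list of observations for feature $i$. Return the symmetric matrix of sample covariances:
$$C_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)$$
where $n$ is the number of observations per feature, $x_{ik}$ is the $k$-th observation of feature $i$, and $\bar{x}_i$ is feature $i$'s mean.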
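One way to turn the definition above into code is sketched below in pure Python. It is a minimal illustration, not the reference solution: it assumes every row has the same length and that there are at least two observations, and it fills the lower triangle by mirroring the upper one.

```python
def calculate_covariance_matrix(vectors: list[list[float]]) -> list[list[float]]:
    """Sample covariance matrix for data laid out as (n_features, n_observations)."""
    n_features = len(vectors)
    n = len(vectors[0])  # observations per feature; assumed equal across rows and >= 2

    # Per-feature means.
    means = [sum(row) / n for row in vectors]

    cov = [[0.0] * n_features for _ in range(n_features)]
    for i in range(n_features):
        for j in range(i, n_features):
            # Sum of products of deviations from the two feature means.
            s = sum(
                (vectors[i][k] - means[i]) * (vectors[j][k] - means[j])
                for k in range(n)
            )
            # Sample covariance: divide by n - 1, and mirror across the diagonal.
            cov[i][j] = cov[j][i] = s / (n - 1)
    return cov


print(calculate_covariance_matrix([[1, 2, 3], [4, 5, 6]]))
# [[1.0, 1.0], [1.0, 1.0]]
```

Computing only the entries with j >= i and copying them across the diagonal halves the work and makes the result symmetric by construction.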
Input
vectors: list[list[float]] of shape (n_features, n_observations). Each inner list holds one feature's values across all observations.
Output
Returns a list[list[float]] of shape (n_features, n_features): the symmetric sample covariance matrix.
Examples
Example 1
Input: [[1, 2, 3], [4, 5, 6]]
Output: [[1.0, 1.0], [1.0, 1.0]]
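Explanation: feature 0 has mean 2 and feature 1 has mean 5, so both features have deviations $(-1, 0, 1)$. Each variance is $\frac{(-1)^2 + 0^2 + 1^2}{3 - 1} = 1$, and the covariance is $\frac{(-1)(-1) + 0 \cdot 0 + 1 \cdot 1}{3 - 1} = 1$.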
Constraints
- Use the sample covariance: divide by $n - 1$, where $n$ is the number of observations.
- The matrix is symmetric: $C_{ij} = C_{ji}$.
- Every vectors[i] has the same length $n$.
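The problem guarantees these constraints, so a solution does not have to check them. If you want a defensive guard anyway, a sketch might look like the following. The helper name validate_vectors and the choice to raise ValueError are assumptions made for illustration, as is the requirement of at least two observations (which follows from dividing by n - 1).

```python
def validate_vectors(vectors: list[list[float]]) -> int:
    """Return n, the shared number of observations, or raise ValueError.

    Hypothetical helper: not part of the required interface.
    """
    if not vectors:
        raise ValueError("vectors must contain at least one feature")
    n = len(vectors[0])
    if any(len(row) != n for row in vectors):
        raise ValueError("every vectors[i] must have the same length")
    if n < 2:
        # Dividing by n - 1 is undefined for a single observation.
        raise ValueError("need at least two observations per feature")
    return n


print(validate_vectors([[1, 2, 3], [4, 5, 6]]))  # 3
```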
Notes
- Mind the orientation: rows are features and columns are observations, which is the transpose of the usual (n_samples, n_features) design matrix (this matches np.cov with rowvar=True).
- Dividing by $n - 1$ (Bessel's correction) gives an unbiased estimate of the population covariance from a sample.
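Both notes can be illustrated with the standard library alone. The snippet below is a small sketch: the design matrix is made-up data in the usual (n_samples, n_features) layout, zip(*...) transposes it into the row-per-feature layout this problem expects, and the statistics module shows the difference between the two divisors.

```python
import statistics

# Hypothetical design matrix in the usual (n_samples, n_features) layout.
design = [
    [1, 4],
    [2, 5],
    [3, 6],
]

# Transpose so that each inner list holds one feature's observations.
vectors = [list(column) for column in zip(*design)]
print(vectors)  # [[1, 2, 3], [4, 5, 6]]

feature = vectors[0]
sample_var = statistics.variance(feature)       # divides by n - 1 (Bessel's correction)
population_var = statistics.pvariance(feature)  # divides by n
print(sample_var, population_var)
```

For this feature the sum of squared deviations is 2, so the sample variance is 2 / 2 = 1 while the population variance is 2 / 3. The gap shrinks as the number of observations grows.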
This problem ships 4 hidden tests, which run in your browser via Pyodide with no backend and no submission queue:
- Reference example: [[1,2,3],[4,5,6]] -> [[1,1],[1,1]]
- Matches numpy's covariance (rows = variables, ddof=1)
- Result is square and symmetric
- Diagonal entries equal the per-feature sample variance
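The hidden tests themselves are not shown, but the properties they describe can be checked locally. The sketch below is an assumption about what such checks could look like, not the actual test code. It includes a compact stand-in implementation so that it runs on its own, and it leaves out the numpy comparison to stay within the standard library. The name check_properties is hypothetical.

```python
import math
import statistics


def calculate_covariance_matrix(vectors):
    # Compact stand-in for the function under test (rows = features).
    n = len(vectors[0])
    centered = []
    for row in vectors:
        mean = sum(row) / n
        centered.append([x - mean for x in row])
    return [
        [sum(a * b for a, b in zip(ri, rj)) / (n - 1) for rj in centered]
        for ri in centered
    ]


def check_properties(vectors):
    """Hypothetical local checks mirroring the listed test descriptions."""
    cov = calculate_covariance_matrix(vectors)
    size = len(vectors)
    # Result is square: n_features rows of n_features entries each.
    assert len(cov) == size and all(len(row) == size for row in cov)
    # Result is symmetric.
    for i in range(size):
        for j in range(size):
            assert math.isclose(cov[i][j], cov[j][i], abs_tol=1e-12)
    # Diagonal entries equal the per-feature sample variance.
    for i in range(size):
        assert math.isclose(cov[i][i], statistics.variance(vectors[i]), abs_tol=1e-12)
    return cov


print(check_properties([[1, 2, 3], [4, 5, 6]]))  # [[1.0, 1.0], [1.0, 1.0]]
```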