Pearson correlation coefficient

Difficulty: Easy

Background

The Pearson correlation coefficient $r$ measures the strength and direction of the linear relationship between two variables, normalised to $[-1, 1]$: $+1$ is a perfect increasing line, $-1$ a perfect decreasing line, and $0$ means no linear relationship. It is the covariance divided by the product of the two standard deviations.
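
The "covariance over the product of standard deviations" definition can be sketched directly with NumPy. The helper name `r_from_covariance` and the sample data are illustrative assumptions, not part of the problem. The only requirement is that the same `ddof` is used for the covariance and for both standard deviations.

```python
import numpy as np

def r_from_covariance(x, y, ddof=0):
    """Covariance divided by the product of the standard deviations.

    ddof=0 gives population statistics, ddof=1 sample statistics.
    The same ddof is used for all three quantities.
    (Hypothetical helper, for illustration only.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.cov(x, y, ddof=ddof)[0, 1]  # off-diagonal entry is cov(x, y)
    return float(cov_xy / (np.std(x, ddof=ddof) * np.std(y, ddof=ddof)))

# Made-up data, chosen only to exercise the function.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 6.0]
r_population = r_from_covariance(x, y, ddof=0)
r_sample = r_from_covariance(x, y, ddof=1)
```

Both calls should agree to floating-point precision, because the normalisation factor appears once in the numerator and once (as two square roots) in the denominator.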

Problem statement

Implement pearson_correlation(x, y) returning the Pearson correlation between two equal-length vectors:

$$
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\;\sqrt{\sum_i (y_i - \bar{y})^2}}
$$

Input

  • x: np.ndarray of shape (n,).
  • y: np.ndarray of shape (n,).

Output

Returns a float in $[-1, 1]$.

Examples

Example 1

Input:  x = [1, 2, 3, 4], y = [2, 4, 6, 8]
Output: 1.0

Explanation: $y = 2x$ is a perfect increasing linear relationship, so $r = 1$.
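
The arithmetic behind Example 1 can be written out term by term. With $\bar{x} = 2.5$ and $\bar{y} = 5$, the centred vectors are $(-1.5, -0.5, 0.5, 1.5)$ and $(-3, -1, 1, 3)$, so:

$$
\sum_i (x_i - \bar{x})(y_i - \bar{y}) = (-1.5)(-3) + (-0.5)(-1) + (0.5)(1) + (1.5)(3) = 10
$$

$$
\sum_i (x_i - \bar{x})^2 = 5, \qquad \sum_i (y_i - \bar{y})^2 = 20
$$

$$
r = \frac{10}{\sqrt{5}\,\sqrt{20}} = \frac{10}{10} = 1
$$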

Constraints

  • Centre both vectors by their means before forming the products.
  • Divide by the product of the centred-vector norms; the $1/n$ (population) or $1/(n-1)$ (sample) factors cancel between numerator and denominator, so sample-vs-population normalisation does not matter.
  • Return a float; the result lies in $[-1, 1]$.
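
One way to follow the constraints above is sketched below; it is a minimal version, not a reference solution. Two choices here are assumptions beyond the problem statement: the final `np.clip`, which only guards against floating-point rounding pushing the value slightly outside the interval, and the lack of special handling for constant inputs, whose zero variance leaves $r$ undefined.

```python
import numpy as np

def pearson_correlation(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation of two equal-length 1-D vectors (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Centre both vectors by their means.
    xc = x - x.mean()
    yc = y - y.mean()

    # Numerator: sum of products of centred values.
    # Denominator: product of the centred-vector Euclidean norms.
    denom = np.sqrt(np.dot(xc, xc)) * np.sqrt(np.dot(yc, yc))
    r = np.dot(xc, yc) / denom

    # Assumption: clip to absorb rounding error just outside [-1, 1].
    return float(np.clip(r, -1.0, 1.0))
```

Calling `pearson_correlation(np.array([1, 2, 3, 4]), np.array([2, 4, 6, 8]))` should reproduce Example 1 up to floating-point rounding.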

Notes

  • Pearson $r$ captures only linear dependence: variables can be strongly related (e.g. $y = x^2$ with $x$ distributed symmetrically about zero) yet have $r \approx 0$.
  • $r$ is the cosine of the angle between the mean-centred vectors, which is why it is invariant to shifting and positive scaling of either variable.
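
Both notes can be checked numerically. The sketch below uses a hypothetical helper `centred_cosine` and synthetic data from a seeded random generator; none of these names or values come from the problem itself.

```python
import numpy as np

def centred_cosine(x, y):
    """Cosine of the angle between the mean-centred vectors (illustrative)."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    yc = np.asarray(y, dtype=float) - np.mean(y)
    return float(np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc)))

# Synthetic, loosely correlated data (assumed for the demo).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)

r = centred_cosine(x, y)

# Shifting and positive scaling of either variable leaves r unchanged.
r_shifted_scaled = centred_cosine(3.0 * x + 5.0, 0.1 * y - 2.0)

# Scaling by a negative factor flips the sign of r.
r_flipped = centred_cosine(-x, y)

# y = x^2 on a grid symmetric about zero: strong dependence, r close to 0.
grid = np.linspace(-1.0, 1.0, 101)
r_parabola = centred_cosine(grid, grid ** 2)
```

The sign flip under negative scaling is why the invariance is stated for positive scaling only.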

This problem ships 4 hidden tests that run in the browser via Pyodide, with no backend or submission queue:

  • Perfect positive linear relationship -> 1.0
  • Perfect negative linear relationship -> -1.0
  • Matches numpy's corrcoef
  • Result lies in [-1, 1]
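
The actual hidden tests are not shown, so the following is only a guess at what checks of this kind might look like. The function `check_pearson`, its inputs, and its tolerances are all hypothetical. It accepts any candidate implementation and is exercised here with a stand-in built on `np.corrcoef`.

```python
import numpy as np

def check_pearson(fn):
    """Run four checks mirroring the test names listed above (hypothetical)."""
    rng = np.random.default_rng(42)
    x = np.array([1.0, 2.0, 3.0, 4.0])

    # Perfect positive linear relationship -> 1.0
    assert np.isclose(fn(x, 2.0 * x), 1.0)

    # Perfect negative linear relationship -> -1.0
    assert np.isclose(fn(x, -2.0 * x), -1.0)

    # Matches numpy's corrcoef on random data
    a = rng.normal(size=50)
    b = rng.normal(size=50)
    assert np.isclose(fn(a, b), np.corrcoef(a, b)[0, 1])

    # Result lies in [-1, 1] for a batch of random inputs
    for _ in range(100):
        u = rng.normal(size=20)
        v = rng.normal(size=20)
        assert -1.0 <= fn(u, v) <= 1.0

    return True

# Stand-in candidate; replace with your own pearson_correlation.
reference = lambda x, y: float(np.corrcoef(x, y)[0, 1])
ok = check_pearson(reference)
```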