Pairwise cosine-similarity matrix
Background
Cosine similarity measures the angle between two vectors, ignoring their magnitudes: 1 for the same direction, 0 for orthogonal, -1 for opposite. It is the standard similarity measure for embeddings: semantic search, RAG retrieval, recommendation, and clustering all rank candidates by cosine similarity to a query.
Problem statement
Implement cosine_similarity_matrix(A, B) returning the matrix S of cosine similarities between every row of A and every row of B:
$$S_{ij} = \frac{A_i \cdot B_j}{\lVert A_i \rVert \, \lVert B_j \rVert}$$
A zero vector has no direction, so its similarity to anything is defined as 0.
Input
- A: np.ndarray of shape (n, d).
- B: np.ndarray of shape (m, d).
Output
Returns an np.ndarray of shape (n, m) where S[i, j] is the cosine similarity of A[i] and B[j].
Examples
Example 1
Input: A = [[1, 0], [0, 1]], B = [[1, 1], [1, 0]]
Output: [[0.7071, 1.0], [0.7071, 0.0]]
Explanation: [1, 0] vs [1, 1] is 1/√2 ≈ 0.7071 and [1, 0] vs [1, 0] is 1.0; [0, 1] vs [1, 1] is 1/√2 ≈ 0.7071 and [0, 1] vs [1, 0] is 0.0 (orthogonal).
Constraints
- Normalise each row to unit length, then take the dot products (equivalently, divide each dot product by the product of the two norms).
- A zero-norm row contributes similarity 0; do not divide by zero.
- Output shape (n, m); values in [-1, 1]; tests compare with atol=1e-6.
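The constraints above can be sketched in NumPy as follows. This is one possible approach under stated assumptions, not the reference solution: replacing zero norms with 1 before dividing is just one way to avoid the division by zero, and the final clip to [-1, 1] is an optional safeguard against floating-point rounding that the problem statement does not ask for.

```python
import numpy as np


def cosine_similarity_matrix(A, B):
    """Cosine similarity between every row of A (n, d) and every row of B (m, d)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)

    # Row norms, kept as (n, 1) and (m, 1) columns so they broadcast over rows.
    a_norm = np.linalg.norm(A, axis=1, keepdims=True)
    b_norm = np.linalg.norm(B, axis=1, keepdims=True)

    # Swap zero norms for 1 before dividing: a zero row stays all zeros,
    # so every similarity involving it comes out as 0 rather than NaN.
    A_unit = A / np.where(a_norm == 0, 1.0, a_norm)
    B_unit = B / np.where(b_norm == 0, 1.0, b_norm)

    # A single matrix product yields all n * m dot products of unit rows.
    S = A_unit @ B_unit.T

    # Optional safeguard (an assumption, not required by the statement):
    # rounding can push values a hair outside [-1, 1].
    return np.clip(S, -1.0, 1.0)


# Example 1 from the problem statement.
S = cosine_similarity_matrix([[1, 0], [0, 1]], [[1, 1], [1, 0]])
print(np.round(S, 4))
```

Running it on the inputs of Example 1 should reproduce that example's output within the stated tolerance.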
Notes
- Pre-normalising the rows turns the whole computation into one matrix product — the efficient way to score a query against a large corpus of embeddings.
- Cosine ignores magnitude, so it is robust to document length and embedding scale; Euclidean distance on normalised vectors is a monotonically decreasing function of cosine similarity, so both produce the same ranking.
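The link between Euclidean distance and cosine comes from expanding the squared distance between two unit vectors: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b = 2 - 2 cos(a, b). A small numerical check is sketched below; the vectors are arbitrary illustrative values, not taken from the problem.

```python
import numpy as np

# Arbitrary illustrative vectors (an assumption, not from the problem statement).
a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

# Normalise both to unit length.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)

cosine = float(a_unit @ b_unit)
sq_dist = float(np.sum((a_unit - b_unit) ** 2))

# For unit vectors, squared Euclidean distance equals 2 - 2 * cosine.
print(np.isclose(sq_dist, 2.0 - 2.0 * cosine))

# Rescaling a vector changes its dot products but not its cosine similarity.
a_scaled = 10.0 * a
cosine_scaled = float(a_scaled @ b) / (np.linalg.norm(a_scaled) * np.linalg.norm(b))
print(np.isclose(cosine, cosine_scaled))
```

Because 2 - 2 cos decreases as the cosine increases, nearest-neighbour search by Euclidean distance over normalised embeddings returns the same order as ranking by cosine similarity.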
This problem ships 4 hidden tests. They run in your browser via Pyodide, with no backend and no submission queue. The tests cover:
- Reference example
- Self-similarity has a unit diagonal for nonzero rows
- Opposite directions give -1, orthogonal give 0
- Zero vector yields zero similarity (no NaN)