Gaussian process regression (RBF)
Background
A Gaussian process (GP) is a distribution over functions: rather than fitting a fixed set of parameters, it places a prior over functions through a kernel (covariance function) and conditions on the observed data to obtain a posterior. At any test input, GP regression predicts a Gaussian with a mean and a variance, so each predicted value comes with an uncertainty estimate. It is a go-to model for small-data regression and Bayesian optimisation.
Problem statement
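Implement gaussian_process_predict(X_train, y_train, X_test, length_scale, sigma, noise) that returns the GP posterior mean and standard deviation at the test points, using the RBF (squared-exponential) kernel:
$$k(x, x') = \sigma^2 \exp\left(-\frac{\lVert x - x' \rVert^2}{2\ell^2}\right)$$
Here $\ell$ is length_scale, $\sigma$ is sigma, $X$ is X_train and $X_*$ is X_test. With $K = k(X, X) + \text{noise} \cdot I$, $K_* = k(X_*, X)$, $K_{**} = k(X_*, X_*)$:
$$\mu = K_* K^{-1} y, \qquad \Sigma = K_{**} - K_* K^{-1} K_*^\top$$
Return $\mu$ and the per-point standard deviations $\sqrt{\operatorname{diag}(\Sigma)}$.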
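The kernel matrix can be built with NumPy broadcasting. The following is a minimal sketch, assuming the common parameterisation in which sigma enters as a prior variance sigma^2 and the exponent is divided by 2 * length_scale^2; the helper name rbf_kernel is hypothetical and not part of the required API.

```python
import numpy as np

def rbf_kernel(A, B, length_scale, sigma):
    """Squared-exponential kernel matrix between the rows of A (p, d) and B (q, d).

    Assumption: k(x, x') = sigma^2 * exp(-||x - x'||^2 / (2 * length_scale^2)).
    """
    # Pairwise differences via broadcasting: shape (p, q, d).
    diff = A[:, None, :] - B[None, :, :]
    # Squared Euclidean distances: shape (p, q).
    sq_dist = np.sum(diff ** 2, axis=-1)
    return sigma ** 2 * np.exp(-0.5 * sq_dist / length_scale ** 2)

X = np.array([[0.0], [2.0], [4.0]])
K = rbf_kernel(X, X, length_scale=1.0, sigma=1.0)
```

Under this assumption the diagonal of K equals sigma^2 and the matrix is symmetric, which is a quick sanity check before adding the noise term.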
Input
- X_train (np.ndarray, shape (n, d)): training inputs.
- y_train (np.ndarray, shape (n,)): training targets.
- X_test (np.ndarray, shape (m, d)): test inputs.
- length_scale (float): RBF length scale $\ell$.
- sigma (float): kernel output scale $\sigma$.
- noise (float): observation-noise variance added to the diagonal of the train–train covariance $K$.
Output
Returns a tuple (mu, std): mu is np.ndarray (m,) of posterior means and std is np.ndarray (m,) of posterior standard deviations.
Examples
Example 1
Input: X_train=[[0],[2],[4]], y_train=[0,1,0], X_test=[[0],[100]],
length_scale=1.0, sigma=1.0, noise=1e-8
Output: mu ~= [0.0, 0.0], std ~= [~0.0, 1.0]
Explanation: at $x = 0$ (a training point) the mean equals the observed $y = 0$ with near-zero uncertainty; at $x = 100$, far from all data, the RBF covariance vanishes, so the prediction reverts to the prior mean $0$ with the prior std $\sigma = 1$.
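The vanishing covariance can be checked numerically. This is a small sketch, assuming the sigma^2 * exp(-d^2 / (2 * length_scale^2)) form of the kernel with sigma = 1 and length_scale = 1 as in Example 1; the variable names are illustrative only.

```python
import numpy as np

x_train = np.array([0.0, 2.0, 4.0])

# Covariance between a test input and each training input
# (assumed kernel form, with sigma = 1 and length_scale = 1).
k_near = np.exp(-0.5 * (0.0 - x_train) ** 2)    # test input on a training point
k_far = np.exp(-0.5 * (100.0 - x_train) ** 2)   # test input far from all data

print(k_near)
print(k_far)
```

At the far point the exponents are so negative that the values underflow to zero in double precision, so the data contribute nothing to the prediction there.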
Constraints
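- Use the RBF kernel; add noise to the diagonal of the train–train covariance before inverting.
- Posterior mean is $K_* K^{-1} y$; the variance is the diagonal of $K_{**} - K_* K^{-1} K_*^\top$, where $K$ is the noisy train–train covariance, $K_*$ the test–train covariance and $K_{**}$ the test–test covariance.
- Return per-point standard deviations (square root of the variance, clipped at $0$ to guard against small negative values from round-off).
- Pure numpy (np.linalg.inv/np.linalg.solve); tests compare with atol=1e-3.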
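The constraints above can be put together as follows. This is one possible sketch rather than a reference solution: it assumes a zero prior mean and a kernel of the form sigma^2 * exp(-||x - x'||^2 / (2 * length_scale^2)), and it uses np.linalg.solve instead of forming the inverse explicitly.

```python
import numpy as np

def gaussian_process_predict(X_train, y_train, X_test, length_scale, sigma, noise):
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)

    def kernel(A, B):
        # Assumed RBF form: sigma^2 * exp(-||a - b||^2 / (2 * length_scale^2)).
        sq_dist = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return sigma ** 2 * np.exp(-0.5 * sq_dist / length_scale ** 2)

    # Train-train covariance with the noise variance on the diagonal: (n, n).
    K = kernel(X_train, X_train) + noise * np.eye(len(X_train))
    # Test-train covariance: (m, n).
    K_s = kernel(X_test, X_train)

    alpha = np.linalg.solve(K, y_train)   # K^{-1} y
    V = np.linalg.solve(K, K_s.T)         # K^{-1} K_*^T, shape (n, m)

    mu = K_s @ alpha
    # Diagonal of K_** - K_* K^{-1} K_*^T; the prior variance k(x, x) is sigma^2.
    var = sigma ** 2 - np.sum(K_s * V.T, axis=1)
    # Clip tiny negative values caused by round-off before the square root.
    std = np.sqrt(np.clip(var, 0.0, None))
    return mu, std

mu, std = gaussian_process_predict(
    [[0], [2], [4]], [0, 1, 0], [[0], [100]],
    length_scale=1.0, sigma=1.0, noise=1e-8,
)
```

Only the diagonal of the posterior covariance is computed, which avoids building the full (m, m) matrix when just the per-point standard deviations are needed.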
Notes
- As noise -> 0 the GP interpolates the training data exactly (the mean passes through every observation); larger noise smooths the fit.
- The length scale $\ell$ controls wiggliness: small $\ell$ gives fast-varying functions, large $\ell$ gives smooth, slowly varying ones.
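Both effects can be seen on the three training points from Example 1. The sketch below is illustrative only: posterior_mean is a hypothetical 1-D helper, not part of the task API, and it assumes the same sigma^2 * exp(-d^2 / (2 * length_scale^2)) kernel with a zero prior mean.

```python
import numpy as np

def posterior_mean(x_train, y_train, x_test, length_scale, noise, sigma=1.0):
    """Posterior mean of a 1-D GP with an RBF kernel (hypothetical helper)."""
    def k(a, b):
        return sigma ** 2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale ** 2)
    K = k(x_train, x_train) + noise * np.eye(len(x_train))
    return k(x_test, x_train) @ np.linalg.solve(K, y_train)

x = np.array([0.0, 2.0, 4.0])
y = np.array([0.0, 1.0, 0.0])
mid = np.array([1.0])  # halfway between the first two training inputs

# Effect of noise: mean evaluated at the training inputs themselves.
near_exact = posterior_mean(x, y, x, length_scale=1.0, noise=1e-8)  # close to y
smoothed = posterior_mean(x, y, x, length_scale=1.0, noise=1.0)     # pulled toward 0

# Effect of length scale: mean evaluated between training inputs.
short_ls = posterior_mean(x, y, mid, length_scale=0.3, noise=1e-8)  # falls back toward 0
long_ls = posterior_mean(x, y, mid, length_scale=1.0, noise=1e-8)   # stays between the neighbours
```

With a short length scale the mean drops back to the prior between observations, which is what makes the fitted function vary quickly; a larger noise value shrinks the mean at the training inputs toward the prior instead of passing through the targets.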
This problem ships 4 hidden tests, which run in the browser via Pyodide. They check that:
- The posterior mean interpolates the training targets at the training inputs.
- Far from the data, the mean reverts to 0 and the std to the prior sigma.
- The function returns mean and std, one entry per test point.
- The predictive standard deviation is non-negative.