Basic autograd operations
Background
Automatic differentiation is the engine behind every deep-learning framework. Each value remembers the operation that produced it and how to push gradients back to its inputs. Running these local rules in reverse topological order — reverse-mode autodiff (backpropagation) — computes the gradient of a final output with respect to every intermediate value in a single backward pass. This problem builds a minimal scalar autograd engine (in the spirit of Karpathy's micrograd).
Problem statement
Implement a Value class supporting +, *, and relu, plus a backward() method.
For each operation, store a local backward rule that accumulates into the inputs' .grad:
backward() builds a topological order of the graph, seeds the output gradient to 1, and calls each node's local rule in reverse.
Input
A graph of Value objects built with Python operators, e.g. d = a + b * c, followed by a call to .backward() on the output node.
Output
After backward(), every Value in the graph has its .grad set to .
Examples
Example 1
a = Value(2); b = Value(-3); c = Value(10)
d = a + b * c # d.data = 2 + (-30) = -28
e = d.relu() # e.data = 0 (since -28 < 0)
e.backward()
# a.grad = 0, b.grad = 0, c.grad = 0
Explanation: relu of a negative number outputs 0 and has zero local gradient, so no gradient flows back — every input gradient is 0.
Constraints
- Gradients must accumulate (
+=), so a value used in multiple places sums its contributions. relu's local gradient is 1 where the input is positive, else 0.backward()must visit nodes in reverse topological order after seeding the output'sgrad = 1.
Notes
- Accumulation (
+=not=) is what makesx + xproducex.grad == 2— both edges contribute. - The same topological-sort-then-reverse pattern scales to tensors; only the per-op local gradients change.
This problem ships 4 hidden tests. They run in your browser via Pyodide — no backend, no submission queue. Press ▶ Run tests to execute.
- •Reference example: ReLU on a negative value zeros all gradients
- •Product-plus-self gradient: z = x*y + x
- •ReLU forward and gradient for a positive input
- •Gradient accumulates when a value is reused: y = x + x