LoRA forward pass
Difficulty: Easy
Tags: numpy, lora, fine-tuning, peft, llm
Background
LoRA (Low-Rank Adaptation) fine-tunes a large model cheaply by freezing the original weight matrix $W_0$ and learning only a small low-rank update $\Delta W = BA$. Because $B \in \mathbb{R}^{\text{out} \times r}$ and $A \in \mathbb{R}^{r \times \text{in}}$ with $r$ tiny, the number of trainable parameters drops by orders of magnitude, yet the adapted layer can still shift behavior significantly.
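To make the parameter savings concrete, here is a small sketch that counts trainable parameters for one layer with and without LoRA. The helper name `lora_param_counts` and the 4096 × 4096 layer with rank 8 are illustrative assumptions, not values taken from the problem.

```python
def lora_param_counts(d_out, d_in, r):
    """Return (full, lora) trainable-parameter counts for one linear layer.

    full: parameters if the whole (d_out, d_in) matrix were trained.
    lora: parameters in B (d_out, r) plus A (r, d_in).
    """
    full = d_out * d_in
    lora = r * (d_out + d_in)
    return full, lora


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, full // lora)  # 16777216 65536 256
```

For this assumed layer the low-rank factors hold 256 times fewer trainable parameters than the full matrix.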
Problem statement
Implement lora_forward(x, W0, A, B, alpha) for the adapted linear layer:
$$h = W_0 x + \frac{\alpha}{r} B A x$$
where $r$ is the LoRA rank (A.shape[0]) and $\alpha / r$ is the scaling factor.
Input
- x: input, shape (in,) or (batch, in).
- W0: frozen base weights, shape (out, in).
- A: LoRA down-projection, shape (r, in).
- B: LoRA up-projection, shape (out, r).
- alpha: float, LoRA scaling numerator.
Output
An np.ndarray of shape (out,) for a 1-D input or (batch, out) for a batched input.
Examples
Example 1
Input: x = [1, 0], W0 = [[1,0],[0,1]], A = [[1,0]], B = [[1],[0]], alpha = 2
Output: [3, 0]
Explanation: $r = 1$, scaling $\alpha / r = 2$. $Ax = [1]$, so $BAx = [1, 0]$. Then $h = W_0 x + 2 \cdot BAx = [1, 0] + [2, 0] = [3, 0]$.
Constraints
- The scaling factor is $\alpha / r$ where $r$ = A.shape[0].
- $\Delta W = BA$ has the same shape as $W_0$ (out × in).
- Support both a single vector and a batch of inputs.
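One way to satisfy these constraints is sketched below. It is a minimal version, not the platform's reference solution. It treats inputs as row vectors, so the same expression handles both a 1-D input and a batch, and it applies A before B so the full (out, in) update matrix is never materialized. The `np.asarray` conversions are an added convenience so plain Python lists also work.

```python
import numpy as np


def lora_forward(x, W0, A, B, alpha):
    """Compute W0 x + (alpha / r) * B A x for x of shape (in,) or (batch, in)."""
    x = np.asarray(x, dtype=float)
    W0 = np.asarray(W0, dtype=float)
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)

    r = A.shape[0]          # LoRA rank
    scaling = alpha / r     # scaling factor alpha / r

    base = x @ W0.T         # (out,) or (batch, out)
    low_rank = x @ A.T      # project down: (r,) or (batch, r)
    update = low_rank @ B.T  # project up: (out,) or (batch, out)
    return base + scaling * update


# Example 1 from the problem statement.
print(lora_forward([1, 0], [[1, 0], [0, 1]], [[1, 0]], [[1], [0]], 2))  # [3. 0.]
```

Multiplying by the transposes on the right is what lets a batch of shape (batch, in) pass through unchanged code: each row is treated as one input vector.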
Notes
- At inference $W_0 + \frac{\alpha}{r} BA$ can be folded into a single matrix, so LoRA adds zero latency once merged.
- Only $A$ and $B$ are trained; keeping $W_0$ frozen is what makes adapting a 70B model feasible on modest hardware.
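The folding described in the first note can be sketched as follows. The helper name `merge_lora` and the random shapes are assumptions made for illustration; the check confirms that the merged matrix gives the same outputs as the unmerged two-path computation.

```python
import numpy as np


def merge_lora(W0, A, B, alpha):
    """Fold the scaled low-rank update into a single (out, in) matrix."""
    scaling = alpha / A.shape[0]
    return W0 + scaling * (B @ A)


# Hypothetical small shapes, chosen only for the demonstration.
rng = np.random.default_rng(0)
out_dim, in_dim, r, alpha = 4, 6, 2, 8.0
W0 = rng.normal(size=(out_dim, in_dim))
A = rng.normal(size=(r, in_dim))
B = rng.normal(size=(out_dim, r))
x = rng.normal(size=(3, in_dim))  # batch of 3 inputs

# Unmerged: base path plus scaled low-rank path.
unmerged = x @ W0.T + (alpha / r) * (x @ A.T) @ B.T

# Merged: a single matrix multiply, same cost as the original layer.
W_merged = merge_lora(W0, A, B, alpha)
merged = x @ W_merged.T

print(np.allclose(unmerged, merged))  # True
```

After merging, the layer is an ordinary linear layer with weights of the original shape, which is why no extra inference cost remains.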
This problem ships 5 hidden tests. They run in your browser via Pyodide, with no backend and no submission queue.
- Reference example
- Zero B reduces to the base layer
- Scaling is alpha / r
- Batched input
- Matches explicit W0 + scaling*BA
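The hidden tests themselves are not shown, so the following is only a hypothetical set of local checks named after the five titles above. The function `check_lora_forward`, the random shapes, and the `candidate` implementation are all assumptions for illustration; passing these checks does not guarantee passing the real suite.

```python
import numpy as np


def check_lora_forward(fn):
    """Run hypothetical checks against a lora_forward-style function."""
    rng = np.random.default_rng(0)
    W0 = rng.normal(size=(3, 4))   # (out, in)
    A = rng.normal(size=(2, 4))    # (r, in), so r = 2
    B = rng.normal(size=(3, 2))    # (out, r)
    x = rng.normal(size=4)         # single input
    xb = rng.normal(size=(5, 4))   # batch of 5 inputs

    # Reference example from the problem statement.
    got = fn(np.array([1.0, 0.0]), np.eye(2),
             np.array([[1.0, 0.0]]), np.array([[1.0], [0.0]]), 2.0)
    assert np.allclose(got, [3.0, 0.0])

    # Zero B reduces to the base layer.
    assert np.allclose(fn(x, W0, A, np.zeros_like(B), 4.0), W0 @ x)

    # Scaling is alpha / r: with alpha == r the update is exactly B A x.
    assert np.allclose(fn(x, W0, A, B, 2.0), (W0 + B @ A) @ x)

    # Batched input keeps the batch dimension.
    assert fn(xb, W0, A, B, 4.0).shape == (5, 3)

    # Matches explicit W0 + scaling * B A.
    W = W0 + (4.0 / 2) * (B @ A)
    assert np.allclose(fn(xb, W0, A, B, 4.0), xb @ W.T)
    return True


def candidate(x, W0, A, B, alpha):
    # Compact row-vector form of W0 x + (alpha / r) * B A x.
    return x @ W0.T + (alpha / A.shape[0]) * (x @ A.T) @ B.T


print(check_lora_forward(candidate))  # True
```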