Swish / SiLU activation
Difficulty: Easy. Tags: numpy, activation, neural-net, silu, swish
Background
Swish (a.k.a. SiLU, $x \cdot \sigma(x)$) is a smooth, non-monotonic activation found by neural-architecture search; it has been reported to slightly outperform ReLU in deep networks, and it is the gate inside SwiGLU. Unlike ReLU, it has a small negative response and a continuous derivative everywhere.
Problem statement
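Implement swish(x, beta=1.0), which computes $\mathrm{swish}(x) = x \cdot \sigma(\beta x)$ elementwise, where $\sigma(z) = 1 / (1 + e^{-z})$ is the sigmoid.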
Input
- x (np.ndarray): input (any shape).
- beta (float): gate sharpness (default 1.0 = SiLU).
Output
Returns an np.ndarray of the same shape.
Examples
Example 1
Input: x = [-1, 0, 1, 2], beta = 1.0
Output: [-0.2689, 0.0, 0.7311, 1.7616]
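Explanation: at $x = 0$, swish $= 0 \cdot \sigma(0) = 0$; at $x = 1$, $1 \cdot \sigma(1) \approx 0.7311$; the negative input gives a small negative output, $-1 \cdot \sigma(-1) \approx -0.2689$.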
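The example can be reproduced with a direct translation of the formula. This is a minimal sketch, not the reference solution: it assumes the input is small enough that `np.exp` does not overflow, and it casts the input to float so integer arrays work.

```python
import numpy as np

def swish(x, beta=1.0):
    """Elementwise x * sigmoid(beta * x); a direct, unoptimized sketch."""
    x = np.asarray(x, dtype=float)
    # x * 1 / (1 + exp(-beta * x)) written as a single division
    return x / (1.0 + np.exp(-beta * x))

out = swish(np.array([-1, 0, 1, 2]), beta=1.0)
print(np.round(out, 4))  # values agree with Example 1 to 4 decimals
```

The result has the same shape as the input, as the Output section requires.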
Constraints
- $\mathrm{swish}(x) = x \cdot \sigma(\beta x)$, applied elementwise.
- $\beta \to 0$ approaches the linear function $x / 2$; large $\beta$ approaches ReLU.
- Tests compare with atol=1e-4.
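The two limits in the constraints can be checked numerically. The sketch below is one possible approach, not necessarily what the hidden tests do: `stable_sigmoid` is a hypothetical helper that only exponentiates non-positive numbers, so a large `beta` does not trigger overflow warnings, and the specific `beta` values (1e-6 and 1e3) are arbitrary choices for illustration.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that only ever evaluates exp on non-positive arguments."""
    z = np.asarray(z, dtype=float)
    e = np.exp(-np.abs(z))  # always in [0, 1], so no overflow
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

def swish(x, beta=1.0):
    x = np.asarray(x, dtype=float)
    return x * stable_sigmoid(beta * x)

x = np.linspace(-3.0, 3.0, 13)
near_linear = swish(x, beta=1e-6)  # beta -> 0: sigmoid ~ 1/2, so output ~ x / 2
near_relu = swish(x, beta=1e3)     # large beta: sigmoid ~ step, so output ~ max(x, 0)

print(np.allclose(near_linear, x / 2, atol=1e-4))
print(np.allclose(near_relu, np.maximum(x, 0), atol=1e-4))
```

A naive `1 / (1 + np.exp(-beta * x))` would still return usable values here, but it can emit overflow warnings when `beta * x` is a large negative number.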
Notes
- Swish is unbounded above but bounded below (a small negative dip near $x \approx -1.28$ for $\beta = 1$); its gradient is nonzero for negative inputs, which helps gradient flow compared with ReLU's hard zero.
- SiLU is the special case $\beta = 1$, used in EfficientNet and as the SwiGLU gate.
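The notes on the negative dip and on gradient flow can be illustrated numerically. The sketch below assumes $\beta = 1$ and uses a derivative obtained from the product rule, $\sigma(\beta x) + \beta x \, \sigma(\beta x)(1 - \sigma(\beta x))$; that formula and the helper name `swish_grad` are additions for illustration and do not appear in the problem statement.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def swish(x, beta=1.0):
    return x * sigmoid(beta * x)

def swish_grad(x, beta=1.0):
    """Product-rule derivative: s + beta * x * s * (1 - s), with s = sigmoid(beta * x)."""
    s = sigmoid(beta * x)
    return s + beta * x * s * (1.0 - s)

# Locate the dip by brute-force grid search (grid spacing 1e-4).
xs = np.linspace(-6.0, 6.0, 120001)
ys = swish(xs)
i = np.argmin(ys)
x_min, y_min = xs[i], ys[i]  # roughly -1.28 and -0.28 for beta = 1

# For a negative input the gradient is small but nonzero, unlike ReLU's exact 0.
grad_at_minus_one = swish_grad(-1.0)
print(x_min, y_min, grad_at_minus_one)
```

Because the output never falls below the dip value, swish is bounded below, while for large positive inputs it grows like $x$.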
This problem ships 4 hidden tests, which run in the browser via Pyodide:
- Reference values
- swish(0) = 0
- Matches x * sigmoid(beta*x)
- Large beta approaches ReLU on positives