class

Mish

extends Module
Mish()

Mish activation function.

Applies element-wise:

$$\text{Mish}(x) = x \cdot \tanh\!\bigl(\text{Softplus}(x)\bigr) = x \cdot \tanh\!\bigl(\ln(1 + e^x)\bigr)$$
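
The formula can be sketched directly in plain Python as a scalar reference. The names `softplus` and `mish_ref` are hypothetical helpers for illustration, not part of lucid, and the overflow-safe form of softplus is one common choice rather than necessarily what the library does internally.

```python
import math


def softplus(x: float) -> float:
    # ln(1 + e^x), rewritten as max(x, 0) + log1p(e^{-|x|}) so that
    # exp() never receives a large positive argument.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish_ref(x: float) -> float:
    # Hypothetical scalar reference for Mish(x) = x * tanh(Softplus(x)).
    return x * math.tanh(softplus(x))


for v in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"mish_ref({v:+.1f}) = {mish_ref(v):+.4f}")
```

For large positive inputs the tanh factor saturates at 1, so the function behaves like the identity; for large negative inputs softplus underflows to 0 and the output tends to zero.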

Mish is smooth, non-monotonic, unbounded above, and bounded below: it reaches a minimum of about -0.31 near x ≈ -1.19 and approaches zero for large negative inputs. Unlike ReLU, it preserves small negative values. It has been reported to outperform Swish/SiLU on some object detection benchmarks, and it is the backbone activation used in YOLOv4.
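
These properties can be checked numerically. The sketch below uses hypothetical scalar helpers (`mish`, `relu`, `silu`) written in plain Python, not lucid functions, and a coarse grid search, so the reported minimum is only accurate to the grid step.

```python
import math


def mish(x: float) -> float:
    # Hypothetical scalar reference, not lucid's implementation.
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def relu(x: float) -> float:
    return max(x, 0.0)


def silu(x: float) -> float:
    # x * sigmoid(x); fine for the moderate inputs used here.
    return x / (1.0 + math.exp(-x))


# Small negative inputs: ReLU zeroes them, SiLU and Mish keep a negative value.
for v in (-3.0, -1.0, -0.1):
    print(f"x={v:+.1f}  relu={relu(v):+.4f}  silu={silu(v):+.4f}  mish={mish(v):+.4f}")

# Coarse grid search for the minimum on [-6, 0] in steps of 0.001.
grid = [i / 1000.0 for i in range(-6000, 1)]
x_min = min(grid, key=mish)
y_min = mish(x_min)
print(f"grid minimum near x={x_min:.3f}, mish={y_min:.4f}")
```

The output is larger on both sides of the minimum than at the minimum itself, which is what non-monotonic means here.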

Notes

  • Input: (*), any shape.
  • Output: (*), same shape as input.

Mish requires computing both a softplus and a tanh, making it slightly more expensive than ReLU or SiLU. The smooth gradient landscape can aid optimisation in very deep networks.
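
To make the cost and smoothness remarks concrete, the derivative follows from the chain rule, using the fact that the derivative of softplus is the sigmoid: Mish'(x) = tanh(Softplus(x)) + x · σ(x) · (1 − tanh²(Softplus(x))). The sketch below uses hypothetical plain-Python helpers (not lucid's autograd) and compares this closed form against a central finite difference.

```python
import math


def softplus(x: float) -> float:
    # Overflow-safe ln(1 + e^x).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x: float) -> float:
    # Branch on the sign so exp() only sees non-positive arguments.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def mish(x: float) -> float:
    return x * math.tanh(softplus(x))


def mish_grad(x: float) -> float:
    # Closed-form derivative: tanh(sp) + x * sigmoid(x) * (1 - tanh(sp)^2).
    t = math.tanh(softplus(x))
    return t + x * sigmoid(x) * (1.0 - t * t)


def central_diff(f, x: float, h: float = 1e-5) -> float:
    # Second-order finite-difference estimate of f'(x).
    return (f(x + h) - f(x - h)) / (2.0 * h)


for v in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={v:+.1f}  analytic={mish_grad(v):+.6f}  numeric={central_diff(mish, v):+.6f}")
```

The backward pass reuses the tanh(Softplus(x)) value from the forward pass and adds one sigmoid evaluation, which is where the extra cost relative to ReLU comes from. The derivative is continuous everywhere, unlike ReLU's step at zero.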

Examples

>>> import lucid
>>> import lucid.nn as nn
>>> m = nn.Mish()
>>> x = lucid.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])
>>> m(x)
tensor([-0.2525, -0.3034,  0.    ,  0.8651,  1.9440])
>>> # Drop-in for SiLU in detection backbones
>>> x = lucid.randn(4, 128, 7, 7)
>>> out = m(x)
>>> out.shape
(4, 128, 7, 7)

Methods (1)

forward(x: Tensor) → Tensor

Apply the activation function element-wise.

Parameters

x: Tensor
Input tensor of arbitrary shape.

Returns

Tensor

Output tensor of the same shape as input.