class Tanh extends Module

Tanh()

Implementing kernel

C++ TanhBackward class

Hyperbolic tangent activation function.

Applies element-wise:

$$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$

Maps all real inputs to the open interval $(-1, 1)$. Unlike sigmoid, its output is zero-centred, which makes it a common choice for the candidate-state and output activations in recurrent architectures (LSTM, GRU).
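The definition above can be checked numerically with a short sketch. It uses only Python's standard `math` module rather than lucid, and the helper names `tanh_from_definition` and `sigmoid` are hypothetical, introduced here for illustration; this is not how the library's kernel is implemented.

```python
import math


def tanh_from_definition(x: float) -> float:
    """Evaluate tanh(x) directly as (e^x - e^-x) / (e^x + e^-x)."""
    ex, enx = math.exp(x), math.exp(-x)
    return (ex - enx) / (ex + enx)


def sigmoid(x: float) -> float:
    """Logistic sigmoid, included only for the zero-centred comparison."""
    return 1.0 / (1.0 + math.exp(-x))


values = [-2.0, -1.0, 0.0, 1.0, 2.0]
outputs = [tanh_from_definition(v) for v in values]

for v, y in zip(values, outputs):
    # Every output lies strictly inside (-1, 1) and matches math.tanh.
    print(f"tanh({v:+.1f}) = {y:+.4f}   math.tanh = {math.tanh(v):+.4f}")

# Zero-centred vs. not: tanh(0) is 0, while sigmoid(0) is 0.5.
print(tanh_from_definition(0.0), sigmoid(0.0))
```

The direct formula is for illustration only: `math.exp` raises `OverflowError` for very large arguments, so numerical code normally relies on a dedicated routine such as `math.tanh`.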

Notes

Input: $(*)$, any shape.
Output: $(*)$, same shape as input, values in $(-1, 1)$.

Like sigmoid, tanh saturates for inputs of large magnitude, where its gradient approaches zero. Its zero-centred output reduces the bias-shift problem in successive layers, but vanishing gradients remain an issue in very deep networks.
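Saturation can be made concrete through the derivative $\frac{d}{dx}\tanh(x) = 1 - \tanh^2(x)$, which equals 1 at the origin and decays towards 0 as $|x|$ grows. The sketch below uses only the standard `math` module; `tanh_grad` is a hypothetical helper, and the depth calculation is a deliberate simplification that assumes every layer sees the same pre-activation and ignores weight matrices.

```python
import math


def tanh_grad(x: float) -> float:
    """Derivative of tanh at x, computed as 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t


# The local gradient shrinks quickly as the input moves away from zero.
for x in [0.0, 1.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}   d/dx tanh(x) = {tanh_grad(x):.2e}")

# Simplified picture of a deep stack: if each of `depth` layers contributes
# the same local factor (assumed pre-activation of 2.0, weights ignored),
# the product of those factors collapses towards zero.
depth = 20
factor = tanh_grad(2.0) ** depth
print(f"product of {depth} local gradients at x = 2.0: {factor:.2e}")
```

In a real network the weight matrices also scale the backpropagated signal, so this only illustrates the contribution of the activation itself.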

Examples

>>> import lucid
>>> import lucid.nn as nn
>>> m = nn.Tanh()
>>> x = lucid.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])
>>> m(x)
tensor([-0.9640, -0.7616,  0.    ,  0.7616,  0.9640])
>>> # Hidden state output in a simple recurrent cell
>>> h = lucid.randn(32, 128)
>>> h_next = m(h)
>>> h_next.shape
(32, 128)

Used by 1

lucid.nn.modules

Instance methods

forward

forward(x: Tensor) → Tensor

Apply the activation function element-wise.

Parameters

input (Tensor)

Input tensor of arbitrary shape.

Returns

Tensor

Output tensor of the same shape as input.
