class RNNCell (extends Module)

RNNCell(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device: DeviceLike = None, dtype: DTypeLike = None)

Single time-step Elman RNN cell.

Computes one recurrent update for a single time step:

$$h_t = \phi\!\left(W_{ih}\,x_t + b_{ih} + W_{hh}\,h_{t-1} + b_{hh}\right)$$

where $\phi$ is either $\tanh$ (default) or $\text{ReLU}$, selected by ``nonlinearity``.
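As a rough illustration of the update above, the following NumPy sketch evaluates the same formula for a batch of inputs. It is a stand-in written for this page, not lucid's implementation: the helper name `rnn_cell_step` and the row-vector batch layout (`x @ w_ih.T`) are assumptions.

```python
import numpy as np

def rnn_cell_step(x, h_prev, w_ih, w_hh, b_ih=None, b_hh=None, nonlinearity="tanh"):
    """One Elman update: h_t = phi(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh).

    Hypothetical reference helper. Assumed shapes:
    x (N, I), h_prev (N, H), w_ih (H, I), w_hh (H, H), biases (H,).
    """
    pre = x @ w_ih.T + h_prev @ w_hh.T   # pre-activation, shape (N, H)
    if b_ih is not None:
        pre = pre + b_ih
    if b_hh is not None:
        pre = pre + b_hh
    if nonlinearity == "tanh":
        return np.tanh(pre)
    if nonlinearity == "relu":
        return np.maximum(pre, 0.0)
    raise ValueError(f"unknown nonlinearity: {nonlinearity!r}")

rng = np.random.default_rng(0)
N, I, H = 3, 8, 16
x = rng.standard_normal((N, I))
h0 = np.zeros((N, H))
w_ih = rng.standard_normal((H, I))
w_hh = rng.standard_normal((H, H))

h1 = rnn_cell_step(x, h0, w_ih, w_hh)
print(h1.shape)  # (3, 16)
```

With `tanh` every entry of the result lies in [-1, 1]; with `relu` every entry is non-negative.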

Use this cell directly when you need fine-grained control over the time loop (e.g. teacher forcing, attention, custom stopping criteria). For standard sequence processing prefer RNN.
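To make the "custom stopping criteria" case concrete, here is a minimal sketch of a hand-written time loop that exits early. The `step` function is a NumPy stand-in for calling the cell, and the saturation threshold of 0.95 is an arbitrary, hypothetical rule chosen only for illustration.

```python
import numpy as np

def step(x_t, h_prev, w_ih, w_hh):
    # Stand-in for `cell(x_t, h_prev)`; biases omitted for brevity.
    return np.tanh(x_t @ w_ih.T + h_prev @ w_hh.T)

rng = np.random.default_rng(0)
L, N, I, H = 50, 3, 8, 16
x_seq = rng.standard_normal((L, N, I))
w_ih = rng.uniform(-0.25, 0.25, size=(H, I))   # 0.25 = 1 / sqrt(16)
w_hh = rng.uniform(-0.25, 0.25, size=(H, H))

h = np.zeros((N, H))
steps_run = 0
for t in range(L):
    h = step(x_seq[t], h, w_ih, w_hh)
    steps_run += 1
    # Hypothetical stopping rule: halt once any hidden unit is nearly saturated.
    if np.max(np.abs(h)) > 0.95:
        break

print(steps_run, h.shape)
```

Because the loop is written by hand, the stopping rule, the input fed at each step, and any per-step bookkeeping can all be changed without touching the cell itself.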

Parameters

input_size : int

Number of features in the input vector $x_t$.

hidden_size : int

Number of features in the hidden state $h_t$.

bias : bool = True

If False, no bias terms are used. Default: True.

nonlinearity : {'tanh', 'relu'} = 'tanh'

Activation function applied to the pre-activation. Default: 'tanh'.

device : DeviceLike = None

Device for weight allocation.

dtype : DTypeLike = None

Data type for weight tensors.

Attributes

weight_ih : Parameter, shape ``(hidden_size, input_size)``

Input–hidden weight matrix $W_{ih}$.

weight_hh : Parameter, shape ``(hidden_size, hidden_size)``

Hidden–hidden weight matrix $W_{hh}$.

bias_ih : Parameter or None, shape ``(hidden_size,)``

Input–hidden bias $b_{ih}$. None when bias=False.

bias_hh : Parameter or None, shape ``(hidden_size,)``

Hidden–hidden bias $b_{hh}$. None when bias=False.
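The attribute shapes determine the number of learnable scalars in a cell. The helper below, a hypothetical function that is not part of the library, simply adds them up.

```python
def rnn_cell_param_count(input_size: int, hidden_size: int, bias: bool = True) -> int:
    """Count scalar parameters implied by the documented attribute shapes (hypothetical helper)."""
    n = hidden_size * input_size      # weight_ih: (hidden_size, input_size)
    n += hidden_size * hidden_size    # weight_hh: (hidden_size, hidden_size)
    if bias:
        n += 2 * hidden_size          # bias_ih and bias_hh: (hidden_size,) each
    return n

print(rnn_cell_param_count(8, 16))              # 416
print(rnn_cell_param_count(8, 16, bias=False))  # 384
```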

Notes

x: (N, input_size) — batch of input vectors.
hx (optional): (N, hidden_size) — initial hidden state. Defaults to zeros when None.
Output h_t: (N, hidden_size).

Weights are initialised from $\mathcal{U}(-1/\sqrt{H},\, 1/\sqrt{H})$ where $H$ = hidden_size.
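The initialisation rule can be sketched in NumPy as follows. The helper name `init_rnn_cell_params` is hypothetical, and drawing the biases from the same distribution as the weight matrices is an assumption, since the note above only mentions "weights".

```python
import math
import numpy as np

def init_rnn_cell_params(input_size, hidden_size, bias=True, seed=0):
    """Draw parameters from U(-1/sqrt(H), 1/sqrt(H)), H = hidden_size (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    bound = 1.0 / math.sqrt(hidden_size)
    params = {
        "weight_ih": rng.uniform(-bound, bound, size=(hidden_size, input_size)),
        "weight_hh": rng.uniform(-bound, bound, size=(hidden_size, hidden_size)),
    }
    if bias:
        # Assumption: biases share the weights' distribution.
        params["bias_ih"] = rng.uniform(-bound, bound, size=(hidden_size,))
        params["bias_hh"] = rng.uniform(-bound, bound, size=(hidden_size,))
    return params

params = init_rnn_cell_params(input_size=8, hidden_size=16)
bound = 1.0 / math.sqrt(16)  # 0.25 for hidden_size=16
print({name: p.shape for name, p in params.items()})
```

For `hidden_size=16` every sampled value lies within ±0.25; larger hidden sizes give a narrower range.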

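For long sequences, vanilla RNNs suffer from the vanishing-gradient problem: backpropagating through each step multiplies the gradient by the recurrent Jacobian $\operatorname{diag}\!\left(\phi'(\cdot)\right) W_{hh}$, so when this factor has norm below 1 the gradient shrinks geometrically with the number of steps (and can explode when the norm is above 1). As a rule of thumb, prefer LSTMCell or GRUCell for sequences longer than ~20 steps.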
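The effect can be observed numerically. The sketch below accumulates the Jacobian of $h_t$ with respect to $h_0$ for a `tanh` recurrence and records its spectral norm at each step. Two simplifying assumptions are made for illustration: the input term is dropped, and $W_{hh}$ is rescaled to spectral norm 0.9 so that the norm after $t$ steps is bounded by $0.9^t$.

```python
import numpy as np

rng = np.random.default_rng(0)
H, T = 16, 50

# Assumption for a clean bound: rescale W_hh so its spectral norm is 0.9.
w_hh = rng.standard_normal((H, H))
w_hh *= 0.9 / np.linalg.norm(w_hh, 2)

h = rng.uniform(-1.0, 1.0, size=H)   # arbitrary starting hidden state
jac = np.eye(H)                      # accumulates d h_t / d h_0
norms = []
for t in range(T):
    h = np.tanh(w_hh @ h)                       # input term omitted
    step_jac = (1.0 - h**2)[:, None] * w_hh     # diag(tanh'(a_t)) @ W_hh
    jac = step_jac @ jac                        # chain rule across steps
    norms.append(np.linalg.norm(jac, 2))

print(norms[0], norms[-1])
```

Since `tanh'` never exceeds 1, each per-step factor has spectral norm at most 0.9, so after 50 steps the accumulated norm is at most about 0.005 of its starting scale.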

See Also

RNN : Multi-layer, multi-step wrapper around RNNCell.
LSTMCell : Single-step LSTM cell (gated, better for long sequences).
GRUCell : Single-step GRU cell.

Examples

Manual time loop with ``tanh`` nonlinearity:
>>> import lucid, lucid.nn as nn
>>> cell = nn.RNNCell(input_size=8, hidden_size=16)
>>> x_seq = lucid.randn(5, 3, 8)   # (L=5, N=3, I=8)
>>> h = lucid.zeros(3, 16)
>>> for t in range(5):
...     h = cell(x_seq[t], h)      # (N=3, H=16)
>>> h.shape
(3, 16)

Using ``relu`` nonlinearity (does not saturate for large positive pre-activations):
>>> cell_relu = nn.RNNCell(8, 16, nonlinearity='relu')
>>> h2 = cell_relu(lucid.randn(4, 8))
>>> h2.shape
(4, 16)

Methods (3)

__init__(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device: DeviceLike = None, dtype: DTypeLike = None) → None

Initialise the RNNCell module. See the class docstring for parameter semantics.

forward(x: Tensor, hx: Tensor | None = None) → Tensor or tuple of Tensor

Run the recurrent forward pass.

Parameters

x : Tensor

Batch of input vectors, shape (N, input_size).

hx : Tensor | None = None

Initial hidden state, shape (N, hidden_size). Defaults to zeros when None.

Returns

Tensor or tuple of Tensor

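Output and (optionally) the new hidden state; see the class docstring. For RNNCell, the class docstring describes a single output: the new hidden state $h_t$ of shape (N, hidden_size).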

extra_repr() → str

Return a string representation of the layer's configuration.