class RNNCell extends Module
RNNCell(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device: DeviceLike = None, dtype: DTypeLike = None)

Single time-step Elman RNN cell.

Computes one recurrent update for a single time step:

$$h_t = \phi\!\left(W_{ih}\,x_t + b_{ih} + W_{hh}\,h_{t-1} + b_{hh}\right)$$

where $\phi$ is either $\tanh$ (default) or $\text{ReLU}$, selected by nonlinearity.
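
The update rule can be sketched in plain Python for a single unbatched input vector. This is only an illustration of the formula, not Lucid's implementation: the function name `rnn_cell_step` and the use of nested lists in place of tensors are assumptions made so the snippet runs without the library.

```python
import math

def rnn_cell_step(x, h_prev, w_ih, w_hh, b_ih, b_hh, nonlinearity="tanh"):
    """One Elman update for a single (unbatched) input vector.

    Hypothetical helper: weights are nested lists, vectors are flat lists.
    """
    if nonlinearity == "tanh":
        phi = math.tanh
    elif nonlinearity == "relu":
        phi = lambda v: max(0.0, v)
    else:
        raise ValueError(f"unknown nonlinearity: {nonlinearity!r}")

    h_new = []
    for row_ih, row_hh, bi, bh in zip(w_ih, w_hh, b_ih, b_hh):
        # Row i of W_ih x_t + b_ih + W_hh h_{t-1} + b_hh
        pre = sum(w * v for w, v in zip(row_ih, x)) + bi
        pre += sum(w * v for w, v in zip(row_hh, h_prev)) + bh
        h_new.append(phi(pre))
    return h_new

# Tiny example with input_size=2, hidden_size=2 and hand-picked weights.
w_ih = [[1.0, 0.0], [0.0, 1.0]]
w_hh = [[0.0, 0.0], [0.0, 0.0]]
b_ih = [0.0, 0.0]
b_hh = [0.0, 0.0]
h_relu = rnn_cell_step([0.5, -2.0], [0.0, 0.0], w_ih, w_hh, b_ih, b_hh, nonlinearity="relu")
h_tanh = rnn_cell_step([0.5, -2.0], [0.0, 0.0], w_ih, w_hh, b_ih, b_hh)
```

With identity input weights and a zero previous state, the pre-activation equals the input, so the two results differ only in the activation applied.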

Use this cell directly when you need fine-grained control over the time loop (e.g. teacher forcing, attention, custom stopping criteria). For standard sequence processing prefer RNN.
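
As one illustration of such control, a manual loop with a custom stopping criterion might look like the sketch below. The helper `run_until` and the stand-in `toy_cell` are hypothetical. The only assumption about the cell is that it is callable as `cell(x, h)` and returns the next hidden state, which is how an RNNCell instance is used in a manual time loop.

```python
def run_until(cell, inputs, h0, should_stop):
    """Step `cell` over `inputs`, stopping early once `should_stop(h)` is true.

    `cell` is any callable with the signature cell(x, h) -> h_next.
    Returns the last hidden state and the number of steps actually taken.
    """
    h = h0
    steps = 0
    for x in inputs:
        h = cell(x, h)
        steps += 1
        if should_stop(h):
            break
    return h, steps

# Stand-in cell for demonstration only (a scalar accumulator, not an RNN).
toy_cell = lambda x, h: h + x
h_final, n_steps = run_until(toy_cell, [1, 2, 3, 4, 5], 0, lambda h: h >= 6)
```

Because the loop lives in user code, the stopping test can inspect the hidden state at every step, which a fused multi-step module would not expose.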

Parameters

input_size : int
Number of features in the input vector $x_t$.
hidden_size : int
Number of features in the hidden state $h_t$.
bias : bool = True
If False, no bias terms are used. Default: True.
nonlinearity : {'tanh', 'relu'} = 'tanh'
Activation function applied to the pre-activation. Default: 'tanh'.
device : DeviceLike = None
Device for weight allocation.
dtype : DTypeLike = None
Data type for weight tensors.

Attributes

weight_ih : Parameter, shape ``(hidden_size, input_size)``
Input–hidden weight matrix $W_{ih}$.
weight_hh : Parameter, shape ``(hidden_size, hidden_size)``
Hidden–hidden weight matrix $W_{hh}$.
bias_ih : Parameter or None, shape ``(hidden_size,)``
Input–hidden bias $b_{ih}$. None when bias=False.
bias_hh : Parameter or None, shape ``(hidden_size,)``
Hidden–hidden bias $b_{hh}$. None when bias=False.
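
The shapes above determine the number of learnable scalars in the cell. The following sketch adds them up. The function name `rnn_cell_param_count` is hypothetical and is not part of the library.

```python
def rnn_cell_param_count(input_size, hidden_size, bias=True):
    """Count learnable scalars implied by the documented attribute shapes."""
    # weight_ih: (hidden_size, input_size), weight_hh: (hidden_size, hidden_size)
    n = hidden_size * input_size + hidden_size * hidden_size
    if bias:
        # bias_ih and bias_hh: (hidden_size,) each
        n += 2 * hidden_size
    return n

n_with_bias = rnn_cell_param_count(8, 16)
n_without_bias = rnn_cell_param_count(8, 16, bias=False)
```

For `input_size=8` and `hidden_size=16` this gives 128 + 256 weights, plus 32 bias entries when bias is enabled.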

Notes

  • x: (N, input_size) — batch of input vectors.
  • hx (optional): (N, hidden_size) — initial hidden state. Defaults to zeros when None.
  • Output h_t: (N, hidden_size).

Weights are initialised from $\mathcal{U}(-1/\sqrt{H},\, 1/\sqrt{H})$, where $H$ = hidden_size.
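
A minimal sketch of this initialisation scheme, using Python's standard `random` module instead of Lucid's own random number generator. The helper `init_uniform` is hypothetical, and the sketch covers a weight matrix only.

```python
import math
import random

def init_uniform(rows, cols, hidden_size, rng):
    """Sample a rows x cols matrix from U(-1/sqrt(H), 1/sqrt(H)), H = hidden_size."""
    bound = 1.0 / math.sqrt(hidden_size)
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)            # seeded for reproducibility
hidden_size, input_size = 16, 8
bound = 1.0 / math.sqrt(hidden_size)   # 0.25 for H = 16
weight_ih = init_uniform(hidden_size, input_size, hidden_size, rng)
```

The bound shrinks as the hidden size grows, which keeps the scale of the pre-activation roughly stable when more hidden units contribute to each sum.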

For long sequences, vanilla RNNs suffer from the vanishing-gradient problem: backpropagation through time multiplies the gradient by a factor involving $W_{hh}$ at every step, so it can shrink geometrically with sequence length. Prefer LSTMCell or GRUCell for sequences longer than ~20 steps.
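
The effect is easiest to see in a scalar toy model. The sketch below assumes a one-unit RNN with no input or bias, $h_t = \tanh(w\,h_{t-1})$, and tracks $\partial h_T / \partial h_0$ with the chain rule. The function name `grad_through_time` and the chosen values are illustrative only.

```python
import math

def grad_through_time(w, h0, steps):
    """Return d h_T / d h_0 for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        grad *= w * (1.0 - h * h)
    return grad

g5 = grad_through_time(0.5, 0.3, 5)     # gradient after 5 steps
g50 = grad_through_time(0.5, 0.3, 50)   # gradient after 50 steps
```

Each step contributes a factor of at most $|w|$, so with $w = 0.5$ the gradient after $T$ steps is bounded by $0.5^T$. In the matrix case the per-step factor is a Jacobian built from $W_{hh}$ and the activation derivative, but the same geometric behaviour applies.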

See Also

RNN : Multi-layer, multi-step wrapper around RNNCell.
LSTMCell : Single-step LSTM cell (gated, better for long sequences).
GRUCell : Single-step GRU cell.

Examples

Manual time loop with ``tanh`` nonlinearity:
>>> import lucid, lucid.nn as nn
>>> cell = nn.RNNCell(input_size=8, hidden_size=16)
>>> x_seq = lucid.randn(5, 3, 8)   # (L=5, N=3, I=8)
>>> h = lucid.zeros(3, 16)
>>> for t in range(5):
...     h = cell(x_seq[t], h)      # (N=3, H=16)
>>> h.shape
(3, 16)

Using ``relu`` nonlinearity (does not saturate for large positive pre-activations):
>>> cell_relu = nn.RNNCell(8, 16, nonlinearity='relu')
>>> h2 = cell_relu(lucid.randn(4, 8))
>>> h2.shape
(4, 16)

Methods (3)


__init__

None
__init__(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device: DeviceLike = None, dtype: DTypeLike = None)

Initialise the RNNCell module. See the class docstring for parameter semantics.


forward

Tensor or tuple of Tensor
forward(x: Tensor, hx: Tensor | None = None)

Run the recurrent forward pass.

Parameters

x : Tensor
Batch of input vectors, shape (N, input_size).
hx : Tensor | None = None
Initial hidden state, shape (N, hidden_size). Defaults to zeros when None.

Returns

Tensor or tuple of Tensor

Output and (optionally) the new hidden state. Per the class docstring, RNNCell outputs $h_t$ with shape (N, hidden_size).


extra_repr

str
extra_repr()

Return a string representation of the layer's configuration.