RNNCell
Module
RNNCell(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device: DeviceLike = None, dtype: DTypeLike = None)
Single time-step Elman RNN cell.
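Computes one recurrent update for a single time step:
$$h_t = \phi\left(W_{ih}\, x_t + b_{ih} + W_{hh}\, h_{t-1} + b_{hh}\right)$$
where $\phi$ is either $\tanh$ (default) or $\mathrm{ReLU}$, selected by nonlinearity.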
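The update can be sketched in plain NumPy. This is an illustrative reference, not the library's implementation: it assumes the standard Elman form (input projection plus recurrent projection plus both biases, passed through the nonlinearity) and batched row-vector inputs, so the weights are applied transposed. The function name `rnn_cell_step` and the random test weights are hypothetical.

```python
import numpy as np

def rnn_cell_step(x, h_prev, w_ih, w_hh, b_ih=None, b_hh=None, nonlinearity="tanh"):
    """One Elman update on a batch (assumed form, not taken from the library source).

    x: (N, input_size), h_prev: (N, hidden_size),
    w_ih: (hidden_size, input_size), w_hh: (hidden_size, hidden_size).
    """
    pre = x @ w_ih.T + h_prev @ w_hh.T
    if b_ih is not None:
        pre = pre + b_ih
    if b_hh is not None:
        pre = pre + b_hh
    if nonlinearity == "tanh":
        return np.tanh(pre)
    if nonlinearity == "relu":
        return np.maximum(pre, 0.0)
    raise ValueError(f"unsupported nonlinearity: {nonlinearity!r}")

# Hypothetical sizes and weights, chosen only for illustration.
rng = np.random.default_rng(0)
N, I, H = 3, 8, 16
x = rng.standard_normal((N, I))
h0 = np.zeros((N, H))
w_ih = 0.1 * rng.standard_normal((H, I))
w_hh = 0.1 * rng.standard_normal((H, H))
b_ih = np.zeros(H)
b_hh = np.zeros(H)

h1 = rnn_cell_step(x, h0, w_ih, w_hh, b_ih, b_hh)
h1_relu = rnn_cell_step(x, h0, w_ih, w_hh, b_ih, b_hh, nonlinearity="relu")
```

With `tanh` every entry of the new state lies in [-1, 1]; with `relu` the state is non-negative and unbounded above.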
Use this cell directly when you need fine-grained control over the
time loop (e.g. teacher forcing, attention, custom stopping
criteria). For standard sequence processing prefer RNN.
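As one illustration of a custom stopping criterion, the sketch below repeats a single-step update until the hidden state stops changing. It uses a NumPy stand-in for the cell rather than the library itself; the `step` function, the tolerance, and the deliberately small recurrent weights (chosen so the update is a contraction and the loop is guaranteed to converge) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, I, H = 3, 8, 16

# Hypothetical weights; w_hh is kept small so repeated updates converge.
w_ih = rng.uniform(-0.25, 0.25, size=(H, I))
w_hh = rng.uniform(-0.05, 0.05, size=(H, H))

def step(x, h):
    """Stand-in for a single cell call: tanh Elman update without biases."""
    return np.tanh(x @ w_ih.T + h @ w_hh.T)

x = rng.standard_normal((N, I))   # the same input is fed at every step
h = np.zeros((N, H))
tol, max_steps = 1e-4, 100

steps_run = 0
for _ in range(max_steps):
    h_new = step(x, h)
    steps_run += 1
    # Custom criterion: stop once no hidden unit moved by more than tol.
    converged = np.max(np.abs(h_new - h)) < tol
    h = h_new
    if converged:
        break
```

A fixed-length wrapper cannot express this kind of data-dependent exit, which is the main reason to drive the cell by hand.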
Parameters
input_size : int
hidden_size : int
bias : bool = True
    If False, no bias terms are used. Default: True.
nonlinearity : (tanh, relu) = 'tanh'
    Default: 'tanh'.
device : DeviceLike = None
dtype : DTypeLike = None
Attributes
weight_ih : Parameter, shape ``(hidden_size, input_size)``
weight_hh : Parameter, shape ``(hidden_size, hidden_size)``
bias_ih : Parameter or None, shape ``(hidden_size,)``
    None when bias=False.
bias_hh : Parameter or None, shape ``(hidden_size,)``
    None when bias=False.
Notes
- x: (N, input_size). Batch of input vectors.
- hx (optional): (N, hidden_size). Initial hidden state. Defaults to zeros when None.
- Output h_t: (N, hidden_size).
Weights are initialised from $\mathcal{U}\left(-1/\sqrt{k},\ 1/\sqrt{k}\right)$, where $k$ = hidden_size.
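A minimal NumPy sketch of such an initialisation, assuming a uniform range of plus or minus 1/sqrt(hidden_size). The helper name `init_rnn_cell_params` is hypothetical, and drawing the biases from the same range is an assumption; the text above only mentions weights.

```python
import numpy as np

def init_rnn_cell_params(input_size, hidden_size, bias=True, seed=0):
    """Draw cell parameters uniformly from [-1/sqrt(k), 1/sqrt(k)), k = hidden_size."""
    rng = np.random.default_rng(seed)
    bound = 1.0 / np.sqrt(hidden_size)
    return {
        "weight_ih": rng.uniform(-bound, bound, size=(hidden_size, input_size)),
        "weight_hh": rng.uniform(-bound, bound, size=(hidden_size, hidden_size)),
        # Assumption: biases share the weight range; None mirrors bias=False.
        "bias_ih": rng.uniform(-bound, bound, size=(hidden_size,)) if bias else None,
        "bias_hh": rng.uniform(-bound, bound, size=(hidden_size,)) if bias else None,
    }

params = init_rnn_cell_params(input_size=8, hidden_size=16)
no_bias = init_rnn_cell_params(input_size=8, hidden_size=16, bias=False)
bound = 1.0 / np.sqrt(16)  # 0.25 for hidden_size = 16
```

Scaling the range with the hidden size keeps the pre-activation variance roughly independent of layer width, which helps avoid saturating `tanh` at the first step.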
For long sequences, vanilla RNNs suffer from the vanishing-gradient
problem because gradients are multiplied by $W_{hh}$ (scaled by the
derivative of the nonlinearity) at every step. Prefer LSTMCell or
GRUCell for sequences longer than ~20 steps.
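The effect can be observed numerically. The NumPy sketch below is an illustration under assumed weights, not a measurement of the library: it unrolls a `tanh` cell on a single sample, accumulates the Jacobian of the current hidden state with respect to the initial one, and records its Frobenius norm at each step. The sizes, the seed, and the uniform weight range are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
I, H, T = 8, 16, 50

# Assumed initialisation range: +/- 1/sqrt(hidden_size).
bound = 1.0 / np.sqrt(H)
w_ih = rng.uniform(-bound, bound, size=(H, I))
w_hh = rng.uniform(-bound, bound, size=(H, H))

h = np.zeros(H)
jac = np.eye(H)      # d h_t / d h_0, starts as the identity
norms = []
for t in range(T):
    x = rng.standard_normal(I)
    h = np.tanh(w_ih @ x + w_hh @ h)
    # One-step Jacobian: diag(1 - h_t^2) @ W_hh
    step_jac = (1.0 - h ** 2)[:, None] * w_hh
    jac = step_jac @ jac
    norms.append(np.linalg.norm(jac))
```

Plotting `norms` on a log scale typically shows a roughly geometric decay, so a loss at step `T` has little influence on parameters through early time steps.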
See Also
RNN : Multi-layer, multi-step wrapper around RNNCell.
LSTMCell : Single-step LSTM cell (gated, better for long sequences).
GRUCell : Single-step GRU cell.
Examples
Manual time loop with ``tanh`` nonlinearity:
>>> import lucid, lucid.nn as nn
>>> cell = nn.RNNCell(input_size=8, hidden_size=16)
>>> x_seq = lucid.randn(5, 3, 8) # (L=5, N=3, I=8)
>>> h = lucid.zeros(3, 16)
>>> for t in range(5):
... h = cell(x_seq[t], h) # (N=3, H=16)
>>> h.shape
(3, 16)
Using ``relu`` nonlinearity (avoids gradient saturation at extremes):
>>> cell_relu = nn.RNNCell(8, 16, nonlinearity='relu')
>>> h2 = cell_relu(lucid.randn(4, 8))
>>> h2.shape
(4, 16)
Methods (3)
__init__
__init__(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device: DeviceLike = None, dtype: DTypeLike = None) → None
Initialise the RNNCell module. See the class docstring for parameter semantics.
forward
forward(x: Tensor, hx: Tensor | None = None) → Tensor or tuple of Tensor
Run the recurrent forward pass.
Parameters
x : Tensor
hx : Tensor = None
Returns
Tensor or tuple of Tensor
    Output and (optionally) the new hidden state; see the class docstring.
extra_repr
extra_repr() → str
Return a string representation of the layer's configuration.