nn.RNNCell

class lucid.nn.RNNCell(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: Literal['tanh', 'relu'] = 'tanh')

The RNNCell module implements a single recurrent step. It processes one time step of input and combines it with the previous hidden state using a configurable activation function (tanh or relu). The cell supports both unbatched inputs of shape (input_size,) and batched inputs of shape (batch_size, input_size).

Class Signature

class lucid.nn.RNNCell(
    input_size: int,
    hidden_size: int,
    bias: bool = True,
    nonlinearity: Literal["tanh", "relu"] = "tanh",
)

Parameters

  • input_size (int): Number of expected features in the input x_t.

  • hidden_size (int): Number of features in the hidden state h_t.

  • bias (bool, optional): If True, adds learnable bias terms bias_ih and bias_hh. Default: True.

  • nonlinearity (Literal["tanh", "relu"], optional): Activation applied to the sum of the input and hidden projections. Default: "tanh".

Attributes

  • weight_ih (Tensor): Input-to-hidden weight of shape (hidden_size, input_size).

  • weight_hh (Tensor): Hidden-to-hidden weight of shape (hidden_size, hidden_size).

  • bias_ih (Tensor or None): Bias added to the input projection. None when bias=False.

  • bias_hh (Tensor or None): Bias added to the hidden-state projection. None when bias=False.

  • nonlinearity (Module): Activation module instance (nn.Tanh or nn.ReLU) applied elementwise.
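
For orientation, the short check below constructs a cell and inspects these attributes. It assumes the parameter tensors expose the same tuple-valued .shape seen in the examples further down; treat it as an illustrative sketch rather than canonical output.

>>> import lucid.nn as nn
>>> cell = nn.RNNCell(input_size=3, hidden_size=2, bias=False)
>>> cell.weight_ih.shape   # (hidden_size, input_size)
(2, 3)
>>> cell.weight_hh.shape   # (hidden_size, hidden_size)
(2, 2)
>>> cell.bias_ih is None   # bias terms are absent when bias=False
True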

Forward Calculation

For an input \(x_t\) and previous hidden state \(h_{t-1}\), the cell computes:

\[h_t = \sigma(x_t W_{ih}^T + b_{ih} + h_{t-1} W_{hh}^T + b_{hh})\]

Where:

  • \(\sigma\) is either \(\tanh\) or \(\text{ReLU}\).

  • \(x_t\) has shape (input_size,) or (batch_size, input_size).

  • \(h_{t-1}\) has shape (hidden_size,) or (batch_size, hidden_size).
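
To make the update rule concrete, the sketch below recomputes one step by hand from the documented attributes and compares it against the module's output. It assumes lucid tensors support matrix multiplication via the @ operator and that the nonlinearity module is callable; neither is guaranteed by this page. For a 1D input, \(x_t W_{ih}^T\) reduces to \(W_{ih} x_t\), so no explicit transpose is needed.

>>> import lucid
>>> import lucid.nn as nn
>>> cell = nn.RNNCell(input_size=3, hidden_size=2, nonlinearity="tanh")
>>> x_t = lucid.Tensor([0.5, -1.0, 0.3])   # Shape: (3,)
>>> h_prev = lucid.zeros(2)                # h_{t-1}, Shape: (2,)
>>> # Manual step: sigma(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)
>>> manual = cell.nonlinearity(
...     cell.weight_ih @ x_t + cell.bias_ih
...     + cell.weight_hh @ h_prev + cell.bias_hh
... )
>>> h_t = cell(x_t, h_prev)  # should match `manual` up to float error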

Handling Initial State

  • If hx is not provided, the hidden state is initialized to zeros on the same device and dtype as the input.

  • Inputs and hidden states can be 1D (unbatched) or 2D (batched). Shapes must agree on batch_size and hidden_size; otherwise, a ValueError is raised.

  • When receiving unbatched input, the returned hidden state is also unbatched (the batch dimension is squeezed out).
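
The sketch below exercises both rules: the implicit zero initial state for an unbatched input, and the ValueError raised on a batch-size mismatch. Only the exception type is stated above; the exact error message is not specified here, so it is elided.

>>> import lucid
>>> import lucid.nn as nn
>>> cell = nn.RNNCell(input_size=2, hidden_size=3)
>>> x = lucid.Tensor([0.1, 0.2])  # unbatched input, Shape: (2,)
>>> h = cell(x)  # hx omitted, so h_{t-1} is initialized to zeros
>>> h.shape  # unbatched input yields an unbatched hidden state
(3,)
>>> batch = lucid.Tensor([[0.1, 0.2], [0.3, 0.4]])  # batch_size = 2
>>> bad_h0 = lucid.zeros(3, 3)  # batch_size 3 disagrees with the input's 2
>>> cell(batch, bad_h0)
Traceback (most recent call last):
    ...
ValueError: ...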

Examples

Single step without an initial hidden state:

>>> import lucid
>>> import lucid.nn as nn
>>> x_t = lucid.Tensor([0.5, -1.0, 0.3], requires_grad=True)  # Shape: (3,)
>>> cell = nn.RNNCell(input_size=3, hidden_size=2, nonlinearity="tanh")
>>> h_t = cell(x_t)  # hx defaults to zeros
>>> h_t.shape
(2,)

Iterating over a sequence manually:

>>> seq = lucid.Tensor([
...     [0.1, 0.2],
...     [0.0, -0.4],
...     [0.3, 0.5],
... ], requires_grad=True)
>>> cell = nn.RNNCell(input_size=2, hidden_size=3, nonlinearity="relu")
>>> h = None
>>> for x_t in seq:
...     h = cell(x_t, h)
>>> h  # Final hidden state after the last time step
Tensor([...], grad=None)

Batched inputs with an explicit initial state:

>>> batch = lucid.Tensor([[1.0, -0.2], [0.4, 0.6]], requires_grad=True)  # Shape: (2, 2)
>>> h0 = lucid.zeros(2, 4)  # Shape: (batch_size, hidden_size)
>>> cell = nn.RNNCell(input_size=2, hidden_size=4, bias=False)
>>> h1 = cell(batch, h0)
>>> h1.shape
(2, 4)