LSTMCell
Module
LSTMCell(input_size: int, hidden_size: int, bias: bool = True, device: DeviceLike = None, dtype: DTypeLike = None)
Single time-step Long Short-Term Memory (LSTM) cell.
Computes one recurrent update using the full LSTM gating equations:
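In the standard LSTM formulation, which is assumed here to match this implementation, the update for input x_t and previous state (h_{t-1}, c_{t-1}) is written as follows. The symbols W_{i\cdot} and W_{h\cdot} are assumed to be the per-gate row blocks of weight_ih and weight_hh, and b_{i\cdot}, b_{h\cdot} the matching blocks of bias_ih and bias_hh; \sigma is the logistic sigmoid and \odot the element-wise product.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}\right) \\
f_t &= \sigma\left(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}\right) \\
g_t &= \tanh\left(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}\right) \\
o_t &= \sigma\left(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```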
The four weight matrices for the gates are stored as a single
vertically-stacked parameter of shape (4H, *), in gate order
[i; f; g; o] (i.e. the first H rows correspond to the input
gate, the next H to the forget gate, and so on).
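To make the stacked layout concrete, here is a minimal NumPy sketch of one step. It assumes the standard LSTM equations and the [i; f; g; o] row order described above. The names `sigmoid` and `lstm_cell_step` are hypothetical helpers, not part of lucid, and the code is a reference for the arithmetic rather than the library's implementation.

```python
import numpy as np


def sigmoid(z):
    # Logistic function, applied element-wise.
    return 1.0 / (1.0 + np.exp(-z))


def lstm_cell_step(x, h_prev, c_prev, weight_ih, weight_hh,
                   bias_ih=None, bias_hh=None):
    """One LSTM step (hypothetical helper).

    x: (N, I); h_prev, c_prev: (N, H);
    weight_ih: (4H, I); weight_hh: (4H, H); biases: (4H,) or None.
    """
    H = h_prev.shape[1]
    # Pre-activations for all four gates at once, shape (N, 4H).
    gates = x @ weight_ih.T + h_prev @ weight_hh.T
    if bias_ih is not None:
        gates = gates + bias_ih
    if bias_hh is not None:
        gates = gates + bias_hh
    # Split the 4H columns in the assumed gate order [i; f; g; o].
    i = sigmoid(gates[:, 0 * H:1 * H])   # input gate
    f = sigmoid(gates[:, 1 * H:2 * H])   # forget gate
    g = np.tanh(gates[:, 2 * H:3 * H])   # candidate cell update
    o = sigmoid(gates[:, 3 * H:4 * H])   # output gate
    c_new = f * c_prev + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new


rng = np.random.default_rng(0)
N, I, H = 3, 10, 20
x = rng.standard_normal((N, I))
h0 = np.zeros((N, H))
c0 = np.zeros((N, H))
w_ih = rng.standard_normal((4 * H, I)) * 0.1
w_hh = rng.standard_normal((4 * H, H)) * 0.1

h1, c1 = lstm_cell_step(x, h0, c0, w_ih, w_hh)
print(h1.shape, c1.shape)  # (3, 20) (3, 20)
```

Computing all four gates with one matrix product per weight is the usual reason for storing them as a single stacked parameter.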
Parameters
input_size : int
hidden_size : int
bias : bool = True
    If False, all bias terms are omitted. Default: True.
device : DeviceLike = None
dtype : DTypeLike = None
Attributes
weight_ih : Parameter, shape ``(4 * hidden_size, input_size)``
weight_hh : Parameter, shape ``(4 * hidden_size, hidden_size)``
bias_ih : Parameter or None, shape ``(4 * hidden_size,)``
    None when bias=False.
bias_hh : Parameter or None, shape ``(4 * hidden_size,)``
    None when bias=False.
Notes
- x: (N, input_size), a batch of input vectors.
- hx (optional): tuple (h_0, c_0), each of shape (N, hidden_size). Defaults to zero tensors when None.
- Output: tuple (h_t, c_t), each of shape (N, hidden_size).
Weights are initialised from .
The forget-gate bias is not initialised to 1 by default
(unlike some implementations). If you observe vanishing gradients
on medium-length sequences, consider manually setting
cell.bias_ih.data[H:2 * H] = 1.0 (with H = hidden_size) after construction.
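The second block of H entries is the forget-gate block because the bias vectors follow the same [i; f; g; o] stacking as the weights. The sketch below uses a plain NumPy array as a stand-in for the bias (it is not a lucid Parameter) to show which entries such a slice assignment touches.

```python
import numpy as np

H = 4  # hidden_size, kept small for illustration
bias_ih = np.zeros(4 * H)  # stand-in for cell.bias_ih, stacked as [i; f; g; o]

# Entries H .. 2H-1 form the forget-gate block.
bias_ih[H:2 * H] = 1.0

blocks = bias_ih.reshape(4, H)  # one row per gate: i, f, g, o
print(blocks.sum(axis=1))  # [0. 4. 0. 0.]
```

If both bias vectors are added to the pre-activation, as in the standard formulation, the effective forget-gate bias is the sum of the forget blocks of bias_ih and bias_hh, so setting only one of them to 1 is enough.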
See Also
LSTM : Multi-layer, multi-step LSTM module.
RNNCell : Simpler single-step cell without gating.
GRUCell : Single-step GRU cell (fewer gates, no separate cell state).
Examples
Single-step update, carrying state across a manual loop:
>>> import lucid, lucid.nn as nn
>>> cell = nn.LSTMCell(input_size=10, hidden_size=20)
>>> h, c = lucid.zeros(3, 20), lucid.zeros(3, 20)
>>> x_seq = lucid.randn(7, 3, 10) # (L=7, N=3, I=10)
>>> for t in range(7):
... h, c = cell(x_seq[t], (h, c))
>>> h.shape, c.shape
((3, 20), (3, 20))
Without providing an explicit initial state (zeros used):
>>> cell2 = nn.LSTMCell(4, 8)
>>> h2, c2 = cell2(lucid.randn(5, 4))
>>> h2.shape
(5, 8)
Methods (3)
__init__
__init__(input_size: int, hidden_size: int, bias: bool = True, device: DeviceLike = None, dtype: DTypeLike = None) → None
Initialise the LSTMCell module. See the class docstring for parameter semantics.
forward
forward(x: Tensor, hx: tuple[Tensor, Tensor] | None = None) → Tensor or tuple of Tensor
Run the recurrent forward pass.
Parameters
x : Tensor
hx : tuple[Tensor, Tensor] | None = None
Returns
Tensor or tuple of Tensor
    Output and (optionally) the new hidden state; see the class docstring.
extra_repr
extra_repr() → str
Return a string representation of the layer's configuration.