class

Module

Module()

Base class for all neural network modules.

Every custom model should subclass this and implement forward. Submodules assigned as attributes are tracked automatically.

Notes

Attribute routing: Setting an attribute follows this priority order:

1. If the value is a lucid.nn.Parameter → stored in _parameters.
2. If the value is a Module → stored in _modules.
3. Otherwise → plain Python attribute.

To register a non-parameter tensor (e.g. a running mean), call register_buffer explicitly.
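The routing rule above can be illustrated with a small pure-Python sketch. It is not Lucid's implementation: Parameter and ToyModule are hypothetical stand-ins, and the _buffers registry name is an assumption made by analogy with _parameters and _modules.

```python
# Toy stand-ins, NOT Lucid's classes: they only mimic the routing rule.
class Parameter:
    def __init__(self, data):
        self.data = data


class ToyModule:
    def __init__(self):
        # Bypass our own __setattr__ while creating the registries.
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_modules", {})
        object.__setattr__(self, "_buffers", {})  # assumed registry name

    def __setattr__(self, name, value):
        if isinstance(value, Parameter):      # 1. Parameter -> _parameters
            self._parameters[name] = value
        elif isinstance(value, ToyModule):    # 2. Module -> _modules
            self._modules[name] = value
        else:                                 # 3. anything else -> plain attribute
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Called only when normal lookup fails, so registries are consulted last.
        for registry in ("_parameters", "_modules", "_buffers"):
            store = self.__dict__.get(registry, {})
            if name in store:
                return store[name]
        raise AttributeError(name)

    def register_buffer(self, name, tensor):
        # Buffers are never inferred from the value type; they must be registered.
        self._buffers[name] = tensor


block = ToyModule()
block.weight = Parameter([1.0, 2.0])
block.child = ToyModule()
block.running_mean = [0.0, 0.0]          # plain attribute, not a buffer
block.register_buffer("running_var", [1.0, 1.0])
```

In this sketch running_mean ends up as an ordinary attribute because nothing about a list (or a bare tensor) marks it as state, which is why the explicit register_buffer call is needed.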

Examples

>>> import lucid
>>> import lucid.nn as nn
>>> class MLP(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.fc1 = nn.Linear(10, 20)
...         self.fc2 = nn.Linear(20, 1)
...
...     def forward(self, x):
...         return self.fc2(lucid.relu(self.fc1(x)))
...
>>> model = MLP()
>>> model(lucid.randn(4, 10)).shape
(4, 1)

Constructors

dunder

init

→None

__init__()

Initialise the instance. See the class docstring for parameter semantics.

dunder

call

→_ModuleOutput

__call__(*args: Tensor, **kwargs: object)

Call the module on the given inputs, dispatching to forward (see the class docstring).

Instance methods

add_module

→None

add_module(name: str, module: Module | None)

Add a child module.

apply

→Self

apply(fn: Callable[[Module], None])

Apply fn recursively to every submodule (including self).
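A minimal sketch of the recursive traversal, using a hypothetical ToyModule rather than Lucid's class. The visiting order (children before the module itself) is an assumption; the text above only states that every submodule and self are visited.

```python
class ToyModule:
    """Hypothetical stand-in with just enough structure to show the traversal."""

    def __init__(self, name, *children):
        self.name = name
        self._children = list(children)
        self.initialised = False

    def apply(self, fn):
        # Assumption: children are visited before the module itself.
        for child in self._children:
            child.apply(fn)
        fn(self)
        return self  # returning self allows chaining


visited = []


def mark(module):
    # A typical fn mutates the module in place, e.g. weight initialisation.
    module.initialised = True
    visited.append(module.name)


root = ToyModule("root", ToyModule("a", ToyModule("a.0")), ToyModule("b"))
result = root.apply(mark)
```

Because apply returns the module, a call such as model.apply(init_fn) can be chained or used directly in an assignment.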

bfloat16

→Self

bfloat16()

Cast all parameters and buffers to bfloat16.

buffers

→Iterator[Tensor]

buffers(recurse: bool = True)

Yield all buffer tensors.

children

→Iterator[Module]

children()

Yield direct child modules.

compile

→Self

compile(*args: object, **kwargs: object)

No-op compatibility stub.

External codepaths often call model.compile() to opt into JIT acceleration; Lucid has no such layer, so this returns self unchanged rather than crashing the caller. Any positional or keyword arguments are accepted and ignored.

cpu

→Self

cpu()

Move all parameters and buffers to CPU.

double

→Self

double()

Cast all parameters and buffers to float64.

eval

→Self

eval()

Set this module and all children to evaluation mode.

extra_repr

→str

extra_repr()

Override to add module-specific details to the repr (e.g. Linear shows in_features).

float

→Self

float()

Cast all parameters and buffers to float32.

forward

→_ModuleOutput

forward(*args: Tensor, **kwargs: object)

Override in subclasses to define the computation.

get_buffer

→Tensor

get_buffer(target: str)

Return buffer at dotted path, e.g. 'bn.running_mean'.

get_extra_state

→object

get_extra_state()

Return extra state to include in state_dict. Override in subclasses.

get_parameter

→Parameter

get_parameter(target: str)

Return parameter at dotted path, e.g. 'fc.weight'.

get_submodule

→Module

get_submodule(target: str)

Return submodule at dotted path, e.g. 'encoder.layer.0'.
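Dotted-path lookup, as used by get_buffer, get_parameter and get_submodule, can be sketched as a walk over nested registries. Node and the free function below are hypothetical; treating an empty target as the root and raising AttributeError for a missing segment are both assumptions.

```python
class Node:
    """Hypothetical module-like container; children live in a plain dict."""

    def __init__(self, **children):
        self._modules = dict(children)


def get_submodule(root, target):
    # An empty target is taken to mean the root itself (an assumption).
    if target == "":
        return root
    current = root
    for part in target.split("."):
        if part not in current._modules:
            raise AttributeError(f"no submodule {part!r} while resolving {target!r}")
        current = current._modules[part]
    return current


layer0 = Node()
# Numeric names such as "0" are ordinary string keys in the path.
model = Node(encoder=Node(layer=Node(**{"0": layer0})))
found = get_submodule(model, "encoder.layer.0")
```

A parameter or buffer path such as 'fc.weight' would resolve the same way: everything before the last dot names a submodule, and the final segment is looked up in that submodule's own registry.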

half

→Self

half()

Cast all parameters and buffers to float16.

load_state_dict

→object

load_state_dict(state_dict: dict[str, Tensor], strict: bool = True, assign: bool = False)

Load parameters from a state_dict.

Calls each module's _load_from_state_dict recursively. Returns _IncompatibleKeys(missing_keys, unexpected_keys) on success. Raises RuntimeError if strict=True and any keys are missing or unexpected, or if any error_msgs accumulated during loading.

Parameters

state_dict: dict

A mapping from parameter/buffer names to tensors.

strict: bool = True

If True (default) require an exact key match; raise on any missing or unexpected keys.

assign: bool = False

If True replace each parameter/buffer object with the loaded tensor directly (allows shape/dtype changes). If False (default) copy data into the existing parameter preserving its dtype and device.
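The strict-mode key check can be sketched without any tensors. This is an illustration only: IncompatibleKeys mirrors the result type named above, and check_keys is a hypothetical helper, not part of Lucid.

```python
from collections import namedtuple

# Hypothetical mirror of the result type named in the text above.
IncompatibleKeys = namedtuple("IncompatibleKeys", ["missing_keys", "unexpected_keys"])


def check_keys(expected, state_dict, strict=True):
    """Compare the keys a model expects with the keys a checkpoint provides."""
    missing = [k for k in expected if k not in state_dict]
    unexpected = [k for k in state_dict if k not in expected]
    if strict and (missing or unexpected):
        raise RuntimeError(f"missing={missing}, unexpected={unexpected}")
    return IncompatibleKeys(missing, unexpected)


expected = ["fc1.weight", "fc1.bias", "fc2.weight"]
checkpoint = {"fc1.weight": 0, "fc1.bias": 0, "head.weight": 0}

# strict=False reports the mismatch instead of raising.
report = check_keys(expected, checkpoint, strict=False)
```

Passing strict=False is the usual way to load a partially matching checkpoint and then inspect the returned key lists to decide whether the mismatch is acceptable.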

metal

→Self

metal()

Move all parameters and buffers to Apple Metal GPU.

modules

→Iterator[Module]

modules()

Yield this module and all submodules (depth-first).

named_buffers

→Iterator[tuple[str, Tensor]]

named_buffers(prefix: str = '', recurse: bool = True, remove_duplicate: bool = True)

Yield (name, buffer) pairs.

named_children

→Iterator[tuple[str, Module]]

named_children()

Yield (name, child_module) pairs.

named_modules

→Iterator[tuple[str, Module]]

named_modules(memo: set[int] | None = None, prefix: str = '', remove_duplicate: bool = True)

Yield (name, module) pairs.
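How memo, prefix and remove_duplicate might interact can be sketched with a depth-first generator. Node is a hypothetical stand-in, and two details are assumptions rather than documented behaviour: the root is yielded under the empty name, and memo holds id() values of modules already visited.

```python
class Node:
    """Hypothetical module-like container; children live in a plain dict."""

    def __init__(self, **children):
        self._modules = dict(children)


def named_modules(module, memo=None, prefix="", remove_duplicate=True):
    if memo is None:
        memo = set()
    if remove_duplicate:
        if id(module) in memo:
            return                      # already yielded under another name
        memo.add(id(module))
    yield prefix, module                # assumption: root is yielded as ""
    for name, child in module._modules.items():
        child_prefix = f"{prefix}.{name}" if prefix else name
        yield from named_modules(child, memo, child_prefix, remove_duplicate)


shared = Node()
root = Node(encoder=Node(proj=shared), decoder=Node(proj=shared))

names = [name for name, _ in named_modules(root)]
all_names = [name for name, _ in named_modules(root, remove_duplicate=False)]
```

With the shared proj module, the default call reports it once under the first path that reaches it, while remove_duplicate=False reports both paths.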

named_parameters

→Iterator[tuple[str, Parameter]]

named_parameters(prefix: str = '', recurse: bool = True, remove_duplicate: bool = True)

Yield (qualified_name, Parameter) pairs from this module's tree.

Parameters

prefix: str = ''

String prepended to every yielded name — used internally by the recursive walk to namespace child modules. Default "".

recurse: bool = True

When True (default), descend into submodules. When False, yield only this module's directly-attached parameters.

remove_duplicate: bool = True

When True (default), each unique Parameter object is yielded only once even if referenced by multiple attributes — matches the reference framework's contract.
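The effect of the three arguments is easiest to see with tied weights. The sketch below uses hypothetical Param and Node classes and a free function; joining prefix and name with a dot is an assumption about the naming scheme.

```python
class Param:
    """Hypothetical parameter object; identity is all that matters here."""


class Node:
    def __init__(self, params=None, children=None):
        self._parameters = dict(params or {})
        self._modules = dict(children or {})


def named_parameters(module, prefix="", recurse=True, remove_duplicate=True):
    seen = set()  # id() of every Parameter already yielded

    def walk(mod, pre):
        for name, param in mod._parameters.items():
            if remove_duplicate:
                if id(param) in seen:
                    continue
                seen.add(id(param))
            yield (f"{pre}.{name}" if pre else name), param
        if recurse:
            for child_name, child in mod._modules.items():
                yield from walk(child, f"{pre}.{child_name}" if pre else child_name)

    yield from walk(module, prefix)


tied = Param()  # one object referenced from two submodules
model = Node(
    params={"scale": Param()},
    children={"embed": Node({"weight": tied}), "head": Node({"weight": tied})},
)

unique = [n for n, _ in named_parameters(model)]
every = [n for n, _ in named_parameters(model, remove_duplicate=False)]
own = [n for n, _ in named_parameters(model, recurse=False)]
```

Deduplication matters mostly for optimisers: a tied weight that appeared twice would otherwise be updated twice per step.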

parameters

→Iterator[Parameter]

parameters(recurse: bool = True)

Yield all Parameters in this module (and children if recurse=True).

register_backward_hook

→RemovableHandle

register_backward_hook(hook: _BackwardHook)

Deprecated alias for register_full_backward_hook.

register_buffer

→None

register_buffer(name: str, tensor: Tensor | None, persistent: bool = True)

register_forward_hook

→RemovableHandle

register_forward_hook(hook: _ForwardHook, prepend: bool = False, with_kwargs: bool = False, always_call: bool = False)

register_forward_pre_hook

→RemovableHandle

register_forward_pre_hook(hook: _ForwardPreHook, prepend: bool = False, with_kwargs: bool = False)

register_full_backward_hook

→RemovableHandle

register_full_backward_hook(hook: _BackwardHook, prepend: bool = False)

register_full_backward_pre_hook

→RemovableHandle

register_full_backward_pre_hook(hook: _BackwardHook, prepend: bool = False)

register_load_state_dict_post_hook

→RemovableHandle

register_load_state_dict_post_hook(hook: Callable[..., object])

Hook signature: hook(module, incompatible_keys) -> None.

register_load_state_dict_pre_hook

→RemovableHandle

register_load_state_dict_pre_hook(hook: Callable[..., object])

Hook signature: hook(module, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs) -> None

The hook may mutate state_dict/missing/unexpected/error_msgs.
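As an illustration of a hook that mutates state_dict, the sketch below renames keys from an older checkpoint layout. The 'dense' and 'fc' names are hypothetical, and the hook is called directly with stand-in arguments so the example runs without Lucid.

```python
def rename_legacy_keys(module, state_dict, prefix, local_metadata, strict,
                       missing_keys, unexpected_keys, error_msgs):
    """Rewrite keys saved under a hypothetical old name 'dense' to 'fc'."""
    old, new = prefix + "dense.", prefix + "fc."
    for key in list(state_dict):          # copy keys: the dict is mutated below
        if key.startswith(old):
            state_dict[new + key[len(old):]] = state_dict.pop(key)


# Direct call with stand-in values, just to show the mutation.
checkpoint = {"dense.weight": 1, "dense.bias": 2, "out.weight": 3}
rename_legacy_keys(None, checkpoint, "", {}, True, [], [], [])

# With a real module the hook would be registered instead of called by hand:
# handle = model.register_load_state_dict_pre_hook(rename_legacy_keys)
```

Since the hook returns None, all of its effect comes from editing the arguments in place; rebinding state_dict to a new dict inside the hook would have no effect on the caller.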

register_module

→None

register_module(name: str, module: Module | None)

Alias for add_module.

register_parameter

→None

register_parameter(name: str, param: Parameter | None)

set_extra_state

→None

set_extra_state(state: object)

Restore extra state loaded from state_dict. Override in subclasses.
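A sketch of a matching get_extra_state / set_extra_state pair. Tokenizer is a hypothetical plain class standing in for a Module subclass, and how state_dict stores the returned object is not shown because the text above does not specify it.

```python
class Tokenizer:
    """Hypothetical module-like class carrying non-tensor state."""

    def __init__(self, vocab=None):
        self.vocab = list(vocab or [])

    def get_extra_state(self):
        # Return something serialisable; copy so later edits do not leak in.
        return {"vocab": list(self.vocab)}

    def set_extra_state(self, state):
        # Inverse of get_extra_state: rebuild the attribute from the saved object.
        self.vocab = list(state["vocab"])


saved = Tokenizer(["<pad>", "hello"]).get_extra_state()

restored = Tokenizer()
restored.set_extra_state(saved)
```

The two methods should be exact inverses of each other, so that saving and reloading leaves the non-tensor state unchanged.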

share_memory

→Self

share_memory()

No-op on Apple Silicon (unified memory is always shared).

state_dict

→dict

state_dict(destination: dict[str, Tensor] | None = None, prefix: str = '', keep_vars: bool = False)

Return an ordered dict mapping parameter / buffer names to tensors.

Includes every learnable parameter, every persistent buffer (persistent=True at register time), and every nested submodule's contribution under a dotted-path prefix.

The returned OrderedDict carries a _metadata attribute mapping module_path → {"version": int} for every module that defines a _version class attribute. lucid.save preserves this attribute across disk round-trips so version-aware _load_from_state_dict hooks can migrate older checkpoints.

Parameters

destination: dict = None

When supplied, results are written into this dict instead of a fresh one. Useful when composing state dicts from multiple modules. Default None.

prefix: str = ''

String prepended to every key — used internally by the recursive walk to namespace child modules. Default "".

keep_vars: bool = False

When True keep tensors attached to autograd (return them as-is); when False (default) detach them so the state dict is safe to serialise / cross threads.

Returns

dict

OrderedDict mapping qualified parameter / buffer paths to their tensor values.
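The dotted-path keys, the persistent-buffer filter and the _metadata attribute can be sketched as a recursive walk. Node is a hypothetical stand-in, plain floats take the place of tensors, and the convention that prefix carries its own trailing dot is an assumption.

```python
from collections import OrderedDict


class Node:
    _version = 1  # modules defining this attribute get a _metadata entry

    def __init__(self, params=None, buffers=None, children=None, non_persistent=()):
        self._parameters = dict(params or {})
        self._buffers = dict(buffers or {})
        self._modules = dict(children or {})
        self._non_persistent = set(non_persistent)


def state_dict(module, destination=None, prefix=""):
    if destination is None:
        destination = OrderedDict()
        destination._metadata = OrderedDict()   # attribute on the dict itself
    metadata = getattr(destination, "_metadata", None)
    if metadata is not None and hasattr(type(module), "_version"):
        metadata[prefix[:-1]] = {"version": module._version}
    for name, value in module._parameters.items():
        destination[prefix + name] = value
    for name, value in module._buffers.items():
        if name not in module._non_persistent:  # only persistent buffers are saved
            destination[prefix + name] = value
    for name, child in module._modules.items():
        state_dict(child, destination, prefix + name + ".")
    return destination


bn = Node(params={"weight": 1.0},
          buffers={"running_mean": 0.0, "scratch": 9.9},
          non_persistent={"scratch"})
model = Node(params={"bias": 0.5}, children={"bn": bn})
sd = state_dict(model)
```

Passing an existing dict as destination lets several modules write into one mapping, which is how the recursion itself accumulates the children's entries.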

to

→Self

to(*args: object, **kwargs: object)

Move/cast all parameters and buffers, preserving Parameter object identity.

Floating-point dtype casts (.float(), .double(), .half(), .bfloat16()) skip integer buffers — e.g. BatchNorm.num_batches_tracked stays int64 — matching the reference framework so checkpoint round-trips don't quietly widen / narrow the counter type. Device moves still apply to every tensor.
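The casting rule can be sketched with dtype names as plain strings. FLOAT_DTYPES and cast_dtype are hypothetical helpers that model only the decision, not the actual tensor conversion.

```python
FLOAT_DTYPES = {"float16", "bfloat16", "float32", "float64"}


def cast_dtype(current, target):
    """Return the dtype a tensor ends up with after a cast request."""
    if target in FLOAT_DTYPES and current not in FLOAT_DTYPES:
        return current          # integer buffers keep their dtype
    return target


tensors = {
    "weight": "float32",
    "running_mean": "float32",
    "num_batches_tracked": "int64",
}

# Equivalent in spirit to calling .half() on a module holding these tensors.
after_half = {name: cast_dtype(dtype, "float16") for name, dtype in tensors.items()}
```

Keeping the counter at int64 means a checkpoint saved after .half() still matches the dtype layout of one saved before it.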

to_empty

→Self

to_empty(device: object = None, recurse: bool = True)

Move parameters and buffers to device without copying data.

The reference framework uses to_empty to materialise a model originally constructed on the meta device. Lucid has no meta device, so this method exists for API parity and delegates to the standard to when a device is supplied.

Parameters

device: object = None

Target placement (string, device, or engine enum). When None (default) the call is a no-op returning self.

recurse: bool = True

Honoured by the underlying to call. Default True — every child module is moved as well.

Returns

Self

The same module, parameters / buffers now on device.

train

→Self

train(mode: bool = True)

Set this module and all children to training mode.

type

→Self

type(dst_type: object)

Cast all parameters and buffers to dst_type.

dst_type may be a lucid.dtype, a Python type (float, int), or a string ("float32", "float16", etc.). Delegates to to, which handles the conversion.

zero_grad

→None

zero_grad(set_to_none: bool = True)

Zero gradients of all parameters.

In-place ops

requires_grad_

→Self

requires_grad_(requires_grad: bool = True)

Set requires_grad for all parameters.

Dunder methods

dunder

repr

→str

__repr__()

Return a developer-facing string representation of the instance.
