conv2d
conv2d(x: Tensor, weight: Tensor, bias: Tensor | None = None, stride: int | tuple[int, int] = 1, padding: int | tuple[int, int] = 0, dilation: int | tuple[int, int] = 1, groups: int = 1) → Tensor
2-D cross-correlation over batched 4-D input.
Despite the name, this computes cross-correlation rather than strict
mathematical convolution (the kernel is not flipped). The kernel slides
over the input, applying a learned linear combination at each spatial
position; channels are mixed through the in_channels dimension of
weight. This is the workhorse op of modern image CNNs (ResNet,
ConvNeXt, EfficientNet, ...).
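The "no kernel flip" distinction can be made concrete with a small sketch. The following is a plain-Python illustration on nested lists, not the library's implementation; `cross_correlate2d` and `flip` are hypothetical helper names, and the sketch assumes a single channel, stride 1, and no padding.

```python
def cross_correlate2d(image, kernel):
    """Valid-mode 2-D cross-correlation on nested lists (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Slide the kernel as-is: no flip before the multiply-accumulate.
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def flip(kernel):
    """Rotate a kernel by 180 degrees, as strict convolution would."""
    return [row[::-1] for row in kernel[::-1]]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

corr = cross_correlate2d(image, kernel)        # what a conv2d-style op computes
conv = cross_correlate2d(image, flip(kernel))  # strict mathematical convolution

print(corr)  # [[-4, -4], [-4, -4]]
print(conv)  # [[4, 4], [4, 4]]
```

For learned kernels the difference is immaterial, since training can absorb the flip into the weights; it matters only when porting hand-designed filters.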
Parameters
x : Tensor
    Shape (N, C_in, H, W).
weight : Tensor
    Shape (C_out, C_in/groups, kH, kW).
bias : Tensor, default None
    Shape (C_out,).
stride : int or (int, int), default 1
padding : int or (int, int), default 0
dilation : int or (int, int), default 1
groups : int, default 1
    Channels are split into groups independent groups. Setting
    groups == C_in gives a depthwise convolution.
Returns
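Tensor
    Output of shape (N, C_out, H_out, W_out), where, under the usual symmetric zero-padding convention,

    $$H_{\mathrm{out}} = \left\lfloor \frac{H + 2 p_h - d_h (k_H - 1) - 1}{s_h} \right\rfloor + 1, \qquad W_{\mathrm{out}} = \left\lfloor \frac{W + 2 p_w - d_w (k_W - 1) - 1}{s_w} \right\rfloor + 1$$

    with (s_h, s_w), (p_h, p_w), and (d_h, d_w) the stride, padding, and dilation along height and width, and k_H, k_W the kernel sizes kH, kW.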
Notes
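Math (for groups = 1, treating x as zero outside its bounds):

$$y[n, o, i, j] = \mathrm{bias}[o] + \sum_{c=0}^{C_{\mathrm{in}}-1} \sum_{a=0}^{k_H-1} \sum_{b=0}^{k_W-1} \mathrm{weight}[o, c, a, b] \, x[n, c, \; i s_h + a d_h - p_h, \; j s_w + b d_w - p_w]$$

Here s, p, and d denote stride, padding, and dilation along the height (h) and width (w) axes, and k_H, k_W are the kernel sizes kH, kW.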
The backward pass is standard: gradients w.r.t. x, weight, and
bias are propagated automatically. groups > 1 yields grouped
convolution (channel blocks computed independently); groups == C_in
together with C_out == C_in is depthwise convolution. Dilation
enlarges the receptive field without increasing the parameter count.
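The effect of groups and dilation on cost can be checked with a little arithmetic. The sketch below uses two hypothetical helpers (not part of the library): `weight_param_count` multiplies out the documented weight shape (C_out, C_in/groups, kH, kW), and `effective_kernel_size` gives the span covered by a dilated kernel along one axis. It assumes groups must divide both channel counts, which is the common convention.

```python
def weight_param_count(c_out, c_in, k_h, k_w, groups=1):
    """Number of weight entries for shape (C_out, C_in/groups, kH, kW)."""
    # Assumption: groups must divide both channel counts.
    if c_in % groups or c_out % groups:
        raise ValueError("groups must divide both c_in and c_out")
    return c_out * (c_in // groups) * k_h * k_w


def effective_kernel_size(k, dilation=1):
    """Input span covered by a k-tap kernel with the given dilation."""
    return dilation * (k - 1) + 1


dense = weight_param_count(64, 64, 3, 3)                 # ordinary convolution
grouped = weight_param_count(64, 64, 3, 3, groups=4)     # 4 independent channel blocks
depthwise = weight_param_count(64, 64, 3, 3, groups=64)  # groups == C_in == C_out

print(dense, grouped, depthwise)     # 36864 9216 576
print(effective_kernel_size(3, 1))   # 3
print(effective_kernel_size(3, 2))   # 5
```

A 3x3 kernel with dilation 2 covers a 5x5 window while still holding nine weights per channel pair, which is the sense in which dilation enlarges the receptive field at no parameter cost.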
Examples
>>> import lucid
>>> from lucid.nn.functional import conv2d
>>> x = lucid.randn(1, 3, 32, 32)
>>> w = lucid.randn(8, 3, 3, 3)
>>> y = conv2d(x, w, stride=1, padding=1)
>>> y.shape
(1, 8, 32, 32)
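The shape printed above can be cross-checked without running the op. This is a minimal sketch of the usual output-size arithmetic, assuming symmetric zero padding and floor rounding; `conv2d_output_shape` and `_pair` are hypothetical helpers, not library functions.

```python
def _pair(v):
    """Expand an int into (v, v); pass a 2-tuple through unchanged."""
    return (v, v) if isinstance(v, int) else tuple(v)


def conv2d_output_shape(x_shape, weight_shape, stride=1, padding=0, dilation=1):
    """Predict (N, C_out, H_out, W_out) from the input and weight shapes."""
    n, _, h, w = x_shape
    c_out, _, k_h, k_w = weight_shape
    s_h, s_w = _pair(stride)
    p_h, p_w = _pair(padding)
    d_h, d_w = _pair(dilation)
    # Assumption: symmetric zero padding, floor division on the stride.
    h_out = (h + 2 * p_h - d_h * (k_h - 1) - 1) // s_h + 1
    w_out = (w + 2 * p_w - d_w * (k_w - 1) - 1) // s_w + 1
    return (n, c_out, h_out, w_out)


same = conv2d_output_shape((1, 3, 32, 32), (8, 3, 3, 3), stride=1, padding=1)
halved = conv2d_output_shape((1, 3, 32, 32), (8, 3, 3, 3), stride=2, padding=1)
dilated = conv2d_output_shape((1, 3, 32, 32), (8, 3, 3, 3), dilation=2)

print(same)     # (1, 8, 32, 32)
print(halved)   # (1, 8, 16, 16)
print(dilated)  # (1, 8, 28, 28)
```

The first call matches the example above: a 3x3 kernel with padding 1 and stride 1 preserves the 32x32 spatial size.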