nn.init.kaiming_uniform¶
The kaiming_uniform function initializes the input tensor with values sampled from a uniform distribution \(U(-\text{bound}, \text{bound})\), where the bound is calculated to maintain a stable variance of activations in the layer.
This initialization method is well-suited for layers that use ReLU or other rectifier-like activation functions.
Function Signature¶
def kaiming_uniform(tensor: Tensor, mode: _FanMode = "fan_in") -> None
Parameters¶
tensor (Tensor): The tensor to be initialized. The shape of the tensor determines the fan-in and fan-out used for the initialization (see the sketch after this list).
mode (_FanMode, optional): Determines whether "fan_in" or "fan_out" is used to compute the bound. Defaults to "fan_in".
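As a rough sketch of the convention assumed here (a 2-D weight of shape (out_features, in_features); this is the common layout, not taken from lucid's source), the fans are derived from the tensor shape as follows:
>>> shape = (4, 3)            # (out_features, in_features) of a 2-D weight tensor
>>> fan_out, fan_in = shape   # fan_out = number of rows, fan_in = number of columns
>>> fan_in, fan_out
(3, 4)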
Returns¶
None: The function modifies the tensor in-place with new values sampled from the uniform distribution.
Forward Calculation¶
The values in the tensor are sampled from a uniform distribution \(U(-\text{bound}, \text{bound})\), where the bound is calculated as:
\[\text{bound} = \sqrt{\frac{6}{\text{fan}}}\]
where \(\text{fan}\) is determined by the mode parameter:
If mode="fan_in", then \(\text{fan} = \text{fan\_in}\), where \(\text{fan\_in}\) is the number of input units in the weight tensor.
If mode="fan_out", then \(\text{fan} = \text{fan\_out}\), where \(\text{fan\_out}\) is the number of output units in the weight tensor.
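A minimal sketch of the bound computation, assuming the standard ReLU gain of \(\sqrt{2}\) so that \(\text{bound} = \sqrt{6 / \text{fan}}\); it mirrors the formula above rather than lucid's internal implementation:
>>> import math, random
>>> fan_in = 2                        # e.g. a (3, 2) weight with mode="fan_in"
>>> bound = math.sqrt(6.0 / fan_in)   # bound = sqrt(6 / fan)
>>> round(bound, 3)
1.732
>>> sample = random.uniform(-bound, bound)  # each element is drawn from U(-bound, bound)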
Examples¶
Basic Kaiming Uniform Initialization
>>> import lucid
>>> from lucid.nn.init import kaiming_uniform
>>> tensor = lucid.zeros((3, 2))
>>> kaiming_uniform(tensor)
>>> print(tensor)
Tensor([[ 0.423, -0.234],
[ 0.342, -0.678],
[ 0.678, 0.123]], requires_grad=False)
Kaiming Uniform Initialization with fan_out mode
>>> tensor = lucid.zeros((4, 4))
>>> kaiming_uniform(tensor, mode="fan_out")
>>> print(tensor)
Tensor([[ 0.563, -0.342, 0.421, -0.678],
[-0.321, 0.654, -0.276, 0.345],
[ 0.876, 0.124, -0.563, -0.234],
[ 0.543, -0.234, 0.657, -0.421]], requires_grad=False)
Note
Kaiming initialization is best suited for layers with ReLU or similar non-linear activations.
For layers with tanh or sigmoid activations, consider using Xavier Initialization instead for better performance.