class

LinearLR

extends _LRScheduler
LinearLR(optimizer: Optimizer, start_factor: float = 1.0 / 3, end_factor: float = 1.0, total_iters: int = 5, last_epoch: int = -1, verbose: bool = False)

Linearly interpolate the learning rate from a start factor to an end factor.

The learning rate is scaled by a factor that changes linearly from start_factor to end_factor over total_iters steps:

$$\text{factor}(t) = \text{start\_factor} + \frac{(\text{end\_factor} - \text{start\_factor}) \cdot \min(t,\, T)}{T}$$

$$\eta_t = \eta_0 \cdot \text{factor}(t)$$

where $T = \text{total\_iters}$.
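The formula can be checked by hand with a few lines of plain Python. This is a minimal sketch of the closed form only, not the library's implementation; the helper name `linear_factor` and the base learning rate `base_lr` are illustrative assumptions.

```python
def linear_factor(t: int, start_factor: float = 1.0 / 3,
                  end_factor: float = 1.0, total_iters: int = 5) -> float:
    """Multiplier applied to the base LR at step t (closed form).

    Hypothetical helper for illustration; not part of the library.
    """
    # Clamp t at total_iters so the factor stays at end_factor afterwards.
    progress = min(t, total_iters) / total_iters
    return start_factor + (end_factor - start_factor) * progress


base_lr = 1e-3  # illustrative base learning rate (eta_0 in the formula)
schedule = [base_lr * linear_factor(t) for t in range(8)]
for t, lr in enumerate(schedule):
    print(f"step {t}: lr = {lr:.6f}")
```

With the defaults, the factor starts at start_factor, grows by equal increments, and stays at end_factor for every step at or beyond total_iters.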

Parameters

optimizer: Optimizer
Wrapped optimizer.
start_factor: float = 1.0 / 3
Multiplier applied to the base LR at step 0 (default: 1/3).
end_factor: float = 1.0
Multiplier applied to the base LR from step total_iters onward (default: 1.0).
total_iters: int = 5
Number of steps over which the interpolation runs (default: 5).
last_epoch: int = -1
The index of the last epoch (default: -1).
verbose: bool = False
If True, print the updated LR after each step (default: False).

Attributes

start_factor: float
Initial LR multiplier.
end_factor: float
Final LR multiplier.
total_iters: int
Interpolation length in scheduler steps.

Notes

LinearLR is commonly used for linear warmup: set start_factor to a small value (e.g. 1/total_iters) and end_factor=1.0 so the learning rate ramps up to its nominal value over the first few epochs.
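To make the warmup setting concrete, the sketch below tabulates the learning rate for start_factor = 1/total_iters. It uses plain Python and the schedule formula from this page rather than the scheduler class; `base_lr` and `warmup_lrs` are illustrative names, and the nominal rate of 1e-3 is an assumption.

```python
total_iters = 5
start_factor = 1.0 / total_iters  # small initial multiplier, as suggested in the note
end_factor = 1.0
base_lr = 1e-3                    # illustrative nominal learning rate

warmup_lrs = []
for epoch in range(total_iters + 3):
    # Fraction of the warmup completed, capped at 1 once warmup is over.
    progress = min(epoch, total_iters) / total_iters
    factor = start_factor + (end_factor - start_factor) * progress
    warmup_lrs.append(base_lr * factor)

# The LR starts at base_lr / total_iters, rises by equal increments,
# and holds at base_lr once epoch >= total_iters.
for epoch, lr in enumerate(warmup_lrs):
    print(f"epoch {epoch}: lr = {lr:.2e}")
```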

Examples

>>> import lucid.optim as optim
>>> optimizer = optim.AdamW(model.parameters(), lr=1e-3)
>>> # Warm up over 5 epochs: LR goes from lr/3 to lr
>>> scheduler = optim.LinearLR(
...     optimizer, start_factor=1/3, end_factor=1.0, total_iters=5
... )
>>> for epoch in range(50):
...     train(...)
...     optimizer.step()
...     scheduler.step()

Methods (2)

dunder

__init__

None
__init__(optimizer: Optimizer, start_factor: float = 1.0 / 3, end_factor: float = 1.0, total_iters: int = 5, last_epoch: int = -1, verbose: bool = False)

Initialise the LinearLR. See the class docstring for parameter semantics.

fn

get_lr

list[float]
get_lr()

Compute the learning rate for each parameter group at the current step.

Returns

list[float]

One learning rate per param group, derived from the schedule formula documented in the class docstring.
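As a rough illustration of the return value, the sketch below applies the schedule formula to a list of base learning rates, one per parameter group. It is a stand-in written under assumptions: `closed_form_lrs` and `base_lrs` are hypothetical names, and the library's own get_lr may compute the same values differently (for example, incrementally from the current learning rate).

```python
def closed_form_lrs(base_lrs: list[float], step: int,
                    start_factor: float = 1.0 / 3, end_factor: float = 1.0,
                    total_iters: int = 5) -> list[float]:
    """Hypothetical stand-in for get_lr(): one LR per parameter group."""
    progress = min(step, total_iters) / total_iters
    factor = start_factor + (end_factor - start_factor) * progress
    # Every group shares the same factor; only the base LR differs.
    return [base_lr * factor for base_lr in base_lrs]


# Two illustrative parameter groups with different base learning rates.
base_lrs = [1e-4, 1e-3]
lrs_start = closed_form_lrs(base_lrs, step=0)
lrs_end = closed_form_lrs(base_lrs, step=5)
print(lrs_start, lrs_end)
```

Because the factor is shared, the ratio between the groups' learning rates stays constant throughout the schedule.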