class CosineAnnealingLR extends _LRScheduler
CosineAnnealingLR(optimizer: Optimizer, T_max: int, eta_min: float = 0, last_epoch: int = -1, verbose: bool = False)
Anneal the learning rate following a cosine curve over T_max epochs.
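The learning rate at epoch $t$ is:
$$\eta_t = \eta_{\min} + \frac{1}{2}\,(\eta_0 - \eta_{\min})\left(1 + \cos\left(\frac{\pi t}{T_{\max}}\right)\right)$$
where $\eta_0$ is the initial learning rate captured from the
optimizer (base_lr) and $\eta_{\min}$ is the floor set by
eta_min.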
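The closed form can be checked by hand with a few lines of plain Python. This is a minimal sketch, assuming the standard cosine-annealing expression; `cosine_annealed_lr` is a hypothetical helper written for illustration and is not part of lucid, whose internal implementation may differ.

```python
import math


def cosine_annealed_lr(epoch: int, base_lr: float, t_max: int, eta_min: float = 0.0) -> float:
    """Closed-form cosine annealing (illustrative helper, not a lucid API)."""
    # The cosine factor runs from 2 at epoch 0 down to 0 at epoch t_max.
    cosine_factor = 1.0 + math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * cosine_factor


# Sample the start, midpoint, and end of a 100-epoch schedule.
for epoch in (0, 50, 100):
    print(epoch, cosine_annealed_lr(epoch, base_lr=0.1, t_max=100, eta_min=1e-5))
```

With base_lr=0.1 and eta_min=1e-5, the schedule starts at roughly 0.1, passes through the midpoint of the two values at epoch 50, and ends at eta_min at epoch 100.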
Parameters
optimizer (Optimizer): Wrapped optimizer.
T_max (int): Half-period of the cosine cycle in epochs. After T_max epochs the learning rate reaches eta_min.
eta_min (float, default 0): Minimum learning rate.
last_epoch (int, default -1): The index of the last epoch.
verbose (bool, default False): Print the updated LR after each step if True.
Attributes
T_max (int): Decay period in epochs.
eta_min (float): Lower bound on the learning rate.
Notes
Over the first T_max epochs, cosine annealing produces a smooth,
monotonically decreasing schedule: the learning rate falls slowly at
first, fastest around the midpoint, and slowly again as it approaches
eta_min. It is widely used for training deep networks and pairs
naturally with warm restarts (see CosineAnnealingWarmRestarts).
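The shape of the curve can be inspected numerically. The sketch below assumes the standard closed-form cosine expression and uses a hypothetical `lr_at` helper (not a lucid function) to measure how much the learning rate drops per epoch; it also evaluates the expression past T_max, where the bare cosine starts rising again. Whether lucid itself continues, clamps, or restarts after T_max is not stated in this page.

```python
import math


def lr_at(epoch: int, base_lr: float = 0.1, t_max: int = 100, eta_min: float = 1e-5) -> float:
    """Closed-form cosine annealing (illustrative helper, not a lucid API)."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / t_max))


# Learning rates for two half-periods, so the behaviour past t_max is visible.
lrs = [lr_at(e) for e in range(201)]

# Per-epoch decrease over the first half-period (epochs 0..100).
drops = [lrs[e] - lrs[e + 1] for e in range(100)]

print("drop at start   :", drops[0])
print("drop at midpoint:", drops[49])
print("drop at end     :", drops[99])
print("lr at epoch 150 :", lrs[150])
```

The per-epoch drop is small at both ends and largest near the midpoint, so the decay is slow, then fast, then slow. Under this closed form the value at epoch 150 is above eta_min, which is why T_max is described as a half-period.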
Examples
>>> import lucid.optim as optim
>>> optimizer = optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = optim.CosineAnnealingLR(optimizer, T_max=100, eta_min=1e-5)
>>> for epoch in range(100):
...     train(...)
...     optimizer.step()
...     scheduler.step()
Methods (2)
dunder __init__ → None
__init__(optimizer: Optimizer, T_max: int, eta_min: float = 0, last_epoch: int = -1, verbose: bool = False)
Initialise the CosineAnnealingLR. See the class docstring for parameter semantics.
fn get_lr → list[float]
get_lr()
Compute the learning rate for each parameter group at the current step.
Returns
list[float]: One learning rate per param group, derived from the schedule formula documented in the class docstring.