class

StepLR

extends_LRScheduler
StepLR(optimizer: Optimizer, step_size: int, gamma: float = 0.1, last_epoch: int = -1, verbose: bool = False)

Decay the learning rate by a fixed multiplicative factor every fixed number of epochs.

With η_0 the initial learning rate, at each epoch t ≥ 1 that is a multiple of step_size the learning rate of every param group is multiplied by gamma; at all other epochs it is left unchanged:

\eta_t = \begin{cases} \eta_{t-1} \times \gamma & \text{if } t \bmod \text{step\_size} = 0 \\ \eta_{t-1} & \text{otherwise} \end{cases}
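The recurrence above can be traced in plain Python, without the library. This is only a minimal sketch: the helper name `step_lr_recurrence` is hypothetical, and it assumes that epoch 0 uses the initial learning rate and that the decay rule applies from epoch 1 onward.

```python
def step_lr_recurrence(base_lr: float, step_size: int, gamma: float,
                       num_epochs: int) -> list[float]:
    """Return the learning rate at epochs 0..num_epochs-1 (hypothetical helper)."""
    if num_epochs <= 0:
        return []
    lrs = [base_lr]  # assumption: epoch 0 uses the initial learning rate
    for t in range(1, num_epochs):
        prev = lrs[-1]
        # Multiply by gamma only when t is a multiple of step_size.
        lrs.append(prev * gamma if t % step_size == 0 else prev)
    return lrs


# Small illustrative values: halve the rate every 3 epochs.
lrs = step_lr_recurrence(base_lr=0.1, step_size=3, gamma=0.5, num_epochs=7)
print(lrs)  # the rate stays flat within each 3-epoch window, then halves
```

With these values the rate is 0.1 for epochs 0 to 2, 0.05 for epochs 3 to 5, and 0.025 at epoch 6.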

Parameters

optimizer: Optimizer
Wrapped optimizer.
step_size: int
Period (in epochs) between learning rate decays.
gamma: float = 0.1
Multiplicative factor applied at each decay step (default: 0.1).
last_epoch: int = -1
The index of the last epoch (default: -1).
verbose: bool = False
Print the updated LR after each step if True (default: False).

Attributes

step_size: int
Decay period in epochs.
gamma: float
Multiplicative decay factor.

Notes

After t epochs, that is, after ⌊t / step_size⌋ full decay steps, the effective learning rate is:

\eta_t = \eta_0 \cdot \gamma^{\lfloor t / \text{step\_size} \rfloor}

A common choice for image classification tasks trained for 90 epochs is step_size=30, gamma=0.1, which decays the learning rate twice, at epochs 30 and 60.
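To make the closed form concrete, the sketch below evaluates it for the 90-epoch setting mentioned above. It is an illustration in plain Python, not the library's implementation: the function name `step_lr_closed_form` and the initial learning rate of 0.1 are assumptions chosen for the example.

```python
def step_lr_closed_form(base_lr: float, step_size: int, gamma: float,
                        epoch: int) -> float:
    """Closed-form learning rate at a given epoch (hypothetical helper)."""
    # epoch // step_size counts the decay steps completed so far.
    return base_lr * gamma ** (epoch // step_size)


# Assumed initial learning rate of 0.1 with step_size=30, gamma=0.1.
schedule = {
    epoch: step_lr_closed_form(0.1, 30, 0.1, epoch)
    for epoch in (0, 29, 30, 59, 60, 89)
}
for epoch, lr in schedule.items():
    # Format to 4 significant digits to hide floating-point rounding noise.
    print(f"epoch {epoch:2d}: lr = {lr:.4g}")
```

Under these assumptions the run splits into three phases: 0.1 for epochs 0 to 29, 0.01 for epochs 30 to 59, and 0.001 for epochs 60 to 89.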

Examples

>>> import lucid.optim as optim
>>> optimizer = optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = optim.StepLR(optimizer, step_size=30, gamma=0.1)
>>> for epoch in range(90):
...     train(...)
...     optimizer.step()
...     scheduler.step()

Methods (2)

dunder

__init__

None
__init__(optimizer: Optimizer, step_size: int, gamma: float = 0.1, last_epoch: int = -1, verbose: bool = False)

Initialise the StepLR scheduler. See the class docstring for parameter semantics.

fn

get_lr

list[float]
get_lr()

Compute the learning rate for each parameter group at the current step.

Returns

list[float]

One learning rate per param group, derived from the schedule formula documented in the class docstring.
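The per-group behaviour can be pictured with a standalone sketch. It is not the library's source: the function `get_lr_sketch` and its `base_lrs` and `last_epoch` arguments are hypothetical stand-ins for whatever state the scheduler keeps internally, and the sketch assumes every group shares the same decay factor applied to its own initial learning rate.

```python
def get_lr_sketch(base_lrs: list[float], last_epoch: int,
                  step_size: int, gamma: float) -> list[float]:
    """Return one learning rate per param group (hypothetical sketch)."""
    if last_epoch < 0:
        # Assumption: the sketch only covers epochs that have started.
        raise ValueError("last_epoch must be non-negative in this sketch")
    # All groups share the same decay factor at a given epoch.
    factor = gamma ** (last_epoch // step_size)
    return [base_lr * factor for base_lr in base_lrs]


# Two param groups with different initial rates, evaluated at epoch 60.
group_lrs = get_lr_sketch([0.1, 0.01], last_epoch=60, step_size=30, gamma=0.1)
print([f"{lr:.4g}" for lr in group_lrs])
```

Each group keeps its own initial rate but follows the same decay schedule, so the ratio between group learning rates stays constant throughout training.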