class

CyclicLR

extends_LRScheduler

CyclicLR(optimizer: Optimizer, base_lr: float, max_lr: float, step_size_up: int = 2000, mode: str = 'triangular', gamma: float = 1.0, last_epoch: int = -1, verbose: bool = False)


Cycle the learning rate between base_lr and max_lr.

Implements the triangular, triangular2, and exp_range cyclic policies. Within each cycle of length $2 \times \text{step\_size\_up}$ the learning rate rises linearly from base_lr to max_lr and then falls back:

$$
\begin{aligned}
\text{cycle} &= \left\lfloor 1 + \frac{t}{2 \cdot s} \right\rfloor \\
x &= \left|\frac{t}{s} - 2 \cdot \text{cycle} + 1\right| \\
\text{scale} &= \max(0,\; 1 - x) \\
\eta_t &= \eta_{\min} + (\eta_{\max} - \eta_{\min}) \cdot \text{scale}
\end{aligned}
$$

where $s = \text{step\_size\_up}$, $t$ is the current step index, $\eta_{\min} = \text{base\_lr}$, and $\eta_{\max} = \text{max\_lr}$.
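As a minimal sketch of the formula above, the triangular schedule can be traced in plain Python. This is an illustration only, not the library's implementation; the function name `triangular_lr` is hypothetical.

```python
import math


def triangular_lr(t: int, base_lr: float, max_lr: float, step_size_up: int) -> float:
    """Learning rate at step t under the triangular policy (illustrative helper)."""
    # Index of the current cycle, starting at 1.
    cycle = math.floor(1 + t / (2 * step_size_up))
    # Distance from the peak of the current cycle, in units of step_size_up.
    x = abs(t / step_size_up - 2 * cycle + 1)
    # 0 at the cycle boundaries, 1 at the peak.
    scale = max(0.0, 1.0 - x)
    return base_lr + (max_lr - base_lr) * scale


# One full cycle with step_size_up=4: four steps up, four steps down.
schedule = [triangular_lr(t, base_lr=0.0, max_lr=1.0, step_size_up=4) for t in range(9)]
print(schedule)  # [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25, 0.0]
```

The peak is reached at step `step_size_up`, and the rate is back at `base_lr` at step `2 * step_size_up`, where the next cycle begins.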

For mode="triangular2" the amplitude halves each cycle:

$$
\text{scale} \mathrel{/}= 2^{\text{cycle}-1}
$$

For mode="exp_range" the amplitude decays exponentially each step:

$$
\text{scale} \mathrel{\times}= \gamma^{t}
$$
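The two amplitude adjustments can be compared side by side with a small standalone sketch. It assumes the mode-specific factor is applied to `scale` before the final interpolation, as the formulas above suggest; the function name `cyclic_lr` is hypothetical and this is not the library's own code.

```python
import math


def cyclic_lr(t, base_lr, max_lr, step_size_up, mode="triangular", gamma=1.0):
    """Learning rate at step t for the three documented modes (illustrative helper)."""
    cycle = math.floor(1 + t / (2 * step_size_up))
    x = abs(t / step_size_up - 2 * cycle + 1)
    scale = max(0.0, 1.0 - x)
    if mode == "triangular2":
        # Amplitude halves with every completed cycle.
        scale /= 2 ** (cycle - 1)
    elif mode == "exp_range":
        # Amplitude is multiplied by gamma**t at step t.
        scale *= gamma ** t
    elif mode != "triangular":
        raise ValueError(f"unknown mode: {mode!r}")
    return base_lr + (max_lr - base_lr) * scale


# Peaks of the first three cycles with step_size_up=4 occur at t = 4, 12, 20.
peaks_tri2 = [cyclic_lr(t, 0.0, 1.0, 4, mode="triangular2") for t in (4, 12, 20)]
print(peaks_tri2)  # [1.0, 0.5, 0.25]

# With gamma=0.5 the first peak is already scaled by 0.5**4.
first_peak_exp = cyclic_lr(4, 0.0, 1.0, 4, mode="exp_range", gamma=0.5)
print(first_peak_exp)  # 0.0625
```

Because `exp_range` decays per step rather than per cycle, `gamma` is usually chosen very close to 1 for long runs; with `gamma=1.0` it reduces to the plain triangular policy.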

Parameters

optimizer: Optimizer

Wrapped optimizer.

base_lr: float

Lower boundary of the learning rate cycle.

max_lr: float

Upper boundary of the learning rate cycle.

step_size_up: int = 2000

Number of steps in the increasing half of the cycle (default: 2000).

mode: str = 'triangular'

One of "triangular" (constant amplitude), "triangular2" (amplitude halves each cycle), or "exp_range" (amplitude scaled by $\gamma^t$ at step $t$). Default: "triangular".

gamma: float = 1.0

Decay factor used only in "exp_range" mode (default: 1.0, which applies no decay).

last_epoch: int = -1

The index of the last epoch (default: -1).

verbose: bool = False

Print the updated LR after each step if True (default: False).

Attributes

base_lr_val: float

Lower bound of the cycle.

max_lr_val: float

Upper bound of the cycle.

step_size_up: int

Half-cycle length in steps.

mode: str

Scaling policy name.

gamma: float

Decay factor for "exp_range" mode.

Notes

Cyclic learning rates can reduce the need for careful manual tuning because the schedule sweeps the learning rate across a range of values during training. A common rule of thumb is to set step_size_up to between 2 and 10 times the number of iterations per epoch.
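The rule of thumb above can be turned into a small helper. This is a sketch under stated assumptions: the name `suggest_step_size_up` and the `multiplier` parameter are hypothetical, and the last partial batch is assumed to count as an iteration.

```python
import math


def suggest_step_size_up(num_samples: int, batch_size: int, multiplier: int = 4) -> int:
    """Pick step_size_up as a multiple (2 to 10) of the iterations per epoch."""
    if not 2 <= multiplier <= 10:
        raise ValueError("multiplier should be between 2 and 10")
    # Assumes the final partial batch is kept, hence the ceiling.
    iters_per_epoch = math.ceil(num_samples / batch_size)
    return multiplier * iters_per_epoch


# 50,000 samples at batch size 128 give 391 iterations per epoch.
step_size_up = suggest_step_size_up(50_000, 128, multiplier=4)
print(step_size_up)  # 1564
```

The result can then be passed as the `step_size_up` argument when constructing the scheduler.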

Examples

>>> import lucid.optim as optim
>>> optimizer = optim.SGD(model.parameters(), lr=0.01)
>>> scheduler = optim.CyclicLR(
...     optimizer, base_lr=1e-4, max_lr=1e-2, step_size_up=500
... )
>>> for batch in dataloader:
...     train_step(batch)
...     optimizer.step()
...     scheduler.step()

Methods (2)

__init__

→None

__init__(optimizer: Optimizer, base_lr: float, max_lr: float, step_size_up: int = 2000, mode: str = 'triangular', gamma: float = 1.0, last_epoch: int = -1, verbose: bool = False)


Initialise the CyclicLR scheduler. See the class docstring for parameter semantics.

get_lr

→list[float]

get_lr()


Compute the learning rate for each parameter group at the current step.

Returns

list[float]

One learning rate per param group, derived from the schedule formula documented in the class docstring.