class RMSprop extends Optimizer

RMSprop(params: Iterable[Parameter] | Iterable[dict[str, object]], lr: float = 0.01, alpha: float = 0.99, eps: float = 1e-08, weight_decay: float = 0, momentum: float = 0, centered: bool = False)

Root Mean Square Propagation optimizer.

RMSprop maintains a running average of the squared gradient for each parameter and divides the gradient by the square root of that average, which normalises the effective learning rate per parameter:

$$\begin{aligned} v_t &= \alpha \, v_{t-1} + (1 - \alpha) \, g_t^2 \\ \theta_t &= \theta_{t-1} - \frac{\eta}{\sqrt{v_t} + \epsilon} \, g_t \end{aligned}$$

With momentum an additional velocity buffer $b$ is maintained:

$$\begin{aligned} b_t &= \mu \, b_{t-1} + \frac{\eta}{\sqrt{v_t} + \epsilon} \, g_t \\ \theta_t &= \theta_{t-1} - b_t \end{aligned}$$
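The two update rules above can be sketched in plain NumPy. This is a minimal illustration and not the library's actual implementation: the function name `rmsprop_step`, the `state` dict, and the way `weight_decay` is folded into the gradient are assumptions made for the example.

```python
import numpy as np

def rmsprop_step(theta, grad, state, lr=0.01, alpha=0.99, eps=1e-8,
                 weight_decay=0.0, momentum=0.0):
    """One RMSprop update on NumPy arrays; returns the new parameter values."""
    if weight_decay != 0.0:
        # Assumption: L2 decay is added to the gradient before averaging.
        grad = grad + weight_decay * theta

    # v_t = alpha * v_{t-1} + (1 - alpha) * g_t^2
    v = state.get("v", np.zeros_like(theta))
    v = alpha * v + (1.0 - alpha) * grad ** 2
    state["v"] = v

    # Per-parameter scaled step: eta / (sqrt(v_t) + eps) * g_t
    update = lr * grad / (np.sqrt(v) + eps)

    if momentum > 0.0:
        # b_t = mu * b_{t-1} + scaled step; the parameter moves by b_t.
        b = state.get("b", np.zeros_like(theta))
        b = momentum * b + update
        state["b"] = b
        update = b

    return theta - update


theta = np.array([1.0, -2.0])
grad = np.array([0.5, -0.5])
state = {}
theta = rmsprop_step(theta, grad, state)  # first step, no momentum
```

On the first step the running average starts at zero, so `v` equals `(1 - alpha) * grad**2` and the step size is roughly `lr / sqrt(1 - alpha)` per parameter, independent of the gradient's magnitude.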

Parameters

params: iterable of Parameter or iterable of dict

Parameters to optimise, or a list of parameter-group dicts.

lr: float = 0.01

Learning rate $\eta$ (default: 1e-2).

alpha: float = 0.99

Smoothing constant $\alpha$ for the running squared-gradient average (default: 0.99).

eps: float = 1e-08

Term $\epsilon$ added to the denominator for numerical stability (default: 1e-8).

weight_decay: float = 0

L2 regularisation coefficient (default: 0).

momentum: float = 0

Momentum factor $\mu$ (default: 0).

centered: bool = False

If True, normalise by the estimated variance rather than the raw second moment (default: False).
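As a rough sketch of what `centered=True` changes, the denominator can be built from a variance estimate by also tracking a running mean of the gradient. The formula below follows the common centered-RMSprop formulation and is an assumption about this implementation; `denominators` is a hypothetical helper written only for this illustration.

```python
import numpy as np

def denominators(grads, alpha=0.99, eps=1e-8):
    """Return (uncentered, centered) denominators after seeing `grads` in order."""
    v = 0.0      # running average of g^2 (raw second moment)
    g_avg = 0.0  # running average of g (only needed when centered)
    for g in grads:
        g = np.asarray(g, dtype=float)
        v = alpha * v + (1.0 - alpha) * g ** 2
        g_avg = alpha * g_avg + (1.0 - alpha) * g

    raw = np.sqrt(v) + eps
    # Variance estimate E[g^2] - E[g]^2, clamped at 0 against round-off.
    centered = np.sqrt(np.maximum(v - g_avg ** 2, 0.0)) + eps
    return raw, centered


raw, centered = denominators([np.array([0.5, -0.5])])
```

Under this formulation the centered denominator is never larger than the uncentered one, so for the same `lr` the centered variant takes steps that are at least as large.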

Attributes

param_groups: list of dict

Parameter groups with keys "params", "lr", "alpha", "eps", "weight_decay", "momentum", and "centered".
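To illustrate how per-group settings might relate to the defaults, the sketch below merges user-supplied group dicts with default hyperparameters in pure Python. It assumes the usual convention that keys missing from a group are filled from `defaults`; `build_param_groups` and `DEFAULTS` are hypothetical names and not part of the library.

```python
# Default hyperparameters, taken from the constructor signature.
DEFAULTS = {
    "lr": 0.01, "alpha": 0.99, "eps": 1e-8,
    "weight_decay": 0.0, "momentum": 0.0, "centered": False,
}

def build_param_groups(params, **overrides):
    """Return (param_groups, defaults) for a flat iterable or a list of group dicts."""
    defaults = {**DEFAULTS, **overrides}
    params = list(params)
    # A flat iterable of parameters becomes a single group.
    if params and not isinstance(params[0], dict):
        params = [{"params": params}]
    groups = []
    for group in params:
        merged = {**defaults, **group}  # group-level keys win over defaults
        merged["params"] = list(merged["params"])
        groups.append(merged)
    return groups, defaults


# Strings stand in for real Parameter objects here.
groups, defaults = build_param_groups(
    [{"params": ["weight"]}, {"params": ["bias"], "lr": 1e-4}],
    lr=1e-3,
)
```

Here the first group inherits `lr=1e-3` from the constructor-level override, while the second keeps its own `lr=1e-4`; every group ends up with all seven keys listed above.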

defaults: dict

Default hyperparameter values.

Notes

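RMSprop was never formally published; Geoffrey Hinton introduced it in the lecture notes for his Coursera course "Neural Networks for Machine Learning". It is suited to non-stationary objectives and recurrent networks, and is often used for reinforcement learning tasks. It generally works well with a moderate learning rate.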

Examples

>>> import lucid.optim as optim
>>> optimizer = optim.RMSprop(model.parameters(), lr=1e-3, alpha=0.99)
>>> optimizer.zero_grad()
>>> loss.backward()
>>> optimizer.step()

Methods (2)

__init__ → None

__init__(params: Iterable[Parameter] | Iterable[dict[str, object]], lr: float = 0.01, alpha: float = 0.99, eps: float = 1e-08, weight_decay: float = 0, momentum: float = 0, centered: bool = False)

Initialise the RMSprop optimizer. See the class docstring for parameter semantics.

step → Tensor | None

step(closure: _OptimizerClosure = None)

Perform a single RMSprop step.