class RMSprop extends Optimizer
RMSprop(params: Iterable[Parameter] | Iterable[dict[str, object]], lr: float = 0.01, alpha: float = 0.99, eps: float = 1e-08, weight_decay: float = 0, momentum: float = 0, centered: bool = False)
Root Mean Square Propagation optimizer.
RMSprop maintains a running average of the squared gradient for each parameter and divides the gradient by the square root of that average, which normalises the effective learning rate per parameter.
With momentum, an additional velocity buffer accumulates the normalised gradient, and the parameter is updated from that buffer instead.
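The two updates described above can be written out as follows. This is the commonly used RMSprop formulation, not a transcription of Lucid's source, so details such as adding eps outside the square root are assumptions. Here g_t is the gradient, v_t the running squared-gradient average, b_t the velocity buffer, and mu the momentum factor.

```latex
% Plain RMSprop (momentum = 0)
\[
\begin{aligned}
v_t      &= \alpha\, v_{t-1} + (1 - \alpha)\, g_t^{2} \\
\theta_t &= \theta_{t-1} - \mathrm{lr}\,\frac{g_t}{\sqrt{v_t} + \epsilon}
\end{aligned}
\]

% With momentum (momentum = \mu > 0)
\[
\begin{aligned}
b_t      &= \mu\, b_{t-1} + \frac{g_t}{\sqrt{v_t} + \epsilon} \\
\theta_t &= \theta_{t-1} - \mathrm{lr}\, b_t
\end{aligned}
\]
```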
Parameters
params : iterable of Parameter or iterable of dict
    Parameters to optimise, or a list of parameter-group dicts.
lr : float = 0.01
    Learning rate (default: 1e-2).
alpha : float = 0.99
    Smoothing constant for the running squared-gradient average (default: 0.99).
eps : float = 1e-08
    Term added to the denominator for numerical stability (default: 1e-8).
weight_decay : float = 0
    L2 regularisation coefficient (default: 0).
momentum : float = 0
    Momentum factor (default: 0).
centered : bool = False
    If True, normalise by the estimated variance rather than the raw second moment (default: False).
Attributes
param_groups : list of dict
    Parameter groups with keys "params", "lr", "alpha", "eps", "weight_decay", "momentum", and "centered".
defaults : dict
    Default hyperparameter values.
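To make the role of each hyperparameter concrete, here is a minimal pure-Python sketch of one update on a single scalar parameter. It is not Lucid's implementation: the function name `rmsprop_step`, the `state` dict, and its buffer names are hypothetical, and the order of operations follows the commonly used formulation.

```python
import math


def rmsprop_step(param, grad, state, lr=0.01, alpha=0.99, eps=1e-8,
                 weight_decay=0.0, momentum=0.0, centered=False):
    """Return the updated scalar parameter (illustrative sketch only).

    `state` is a dict that carries the running buffers between calls.
    """
    if weight_decay != 0:
        # L2 regularisation is folded into the gradient.
        grad = grad + weight_decay * param

    # Running average of the squared gradient, smoothed by alpha.
    state["square_avg"] = (alpha * state.get("square_avg", 0.0)
                           + (1 - alpha) * grad * grad)
    avg = state["square_avg"]

    if centered:
        # Subtract the squared mean gradient to estimate the variance.
        state["grad_avg"] = (alpha * state.get("grad_avg", 0.0)
                             + (1 - alpha) * grad)
        avg = avg - state["grad_avg"] ** 2

    # eps keeps the denominator away from zero (assumed outside the sqrt).
    denom = math.sqrt(max(avg, 0.0)) + eps

    if momentum > 0:
        # Velocity buffer accumulates the normalised gradient.
        state["velocity"] = momentum * state.get("velocity", 0.0) + grad / denom
        return param - lr * state["velocity"]
    return param - lr * grad / denom


# Minimise f(x) = x**2, whose gradient is 2 * x, starting from x = 1.0.
x, state = 1.0, {}
for _ in range(500):
    x = rmsprop_step(x, 2 * x, state, lr=0.01)
```

Because the gradient is divided by its own running magnitude, the step size depends mostly on lr rather than on the raw gradient scale, which is the per-parameter normalisation described above.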
Notes
RMSprop was introduced by Geoffrey Hinton in lecture material rather than in a published paper. It is well suited to non-stationary objectives and recurrent networks, works well with a moderate learning rate, and is often used for reinforcement learning tasks.
Examples
>>> import lucid.optim as optim
>>> optimizer = optim.RMSprop(model.parameters(), lr=1e-3, alpha=0.99)
>>> optimizer.zero_grad()
>>> loss.backward()
>>> optimizer.step()
Methods (2)
__init__(params: Iterable[Parameter] | Iterable[dict[str, object]], lr: float = 0.01, alpha: float = 0.99, eps: float = 1e-08, weight_decay: float = 0, momentum: float = 0, centered: bool = False) → None
Initialise the RMSprop optimizer. See the class docstring for parameter semantics.
step(closure: _OptimizerClosure = None) → Tensor | None
Perform a single RMSprop step.
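The `Tensor | None` return type suggests the usual closure contract: when a closure is passed, `step` calls it to re-evaluate the loss and returns that loss, otherwise it returns None. The sketch below illustrates that contract under this assumption. `ToyParam` and `ToyOptimizer` are hypothetical stand-ins, not Lucid classes, and the update is plain gradient descent rather than RMSprop, to keep the example short.

```python
class ToyParam:
    """Hypothetical scalar parameter holding a value and its gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class ToyOptimizer:
    """Hypothetical optimizer that shows only the step(closure) contract."""

    def __init__(self, params, lr=0.1):
        self.params = list(params)
        self.lr = lr

    def zero_grad(self):
        for p in self.params:
            p.grad = 0.0

    def step(self, closure=None):
        # Re-evaluate the loss only when a closure is supplied.
        loss = closure() if closure is not None else None
        for p in self.params:
            p.value -= self.lr * p.grad  # plain gradient descent update
        return loss


w = ToyParam(3.0)
opt = ToyOptimizer([w], lr=0.1)


def closure():
    # Clear old gradients, compute the loss, and populate fresh gradients.
    opt.zero_grad()
    loss = (w.value - 1.0) ** 2
    w.grad = 2.0 * (w.value - 1.0)  # hand-written stand-in for loss.backward()
    return loss


loss = opt.step(closure)  # loss is the value returned by the closure
```

Calling `opt.step()` with no argument would reuse whatever gradients are already stored and return None, which matches the `zero_grad()`, `backward()`, `step()` sequence shown in the Examples.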