class Rprop

extends Optimizer

Rprop(params: Iterable[Parameter] | Iterable[dict[str, object]], lr: float = 0.01, etas: tuple[float, float] = (0.5, 1.2), step_sizes: tuple[float, float] = (1e-06, 50))

Resilient Backpropagation optimizer.

Rprop ignores gradient magnitudes and adapts a per-parameter step size based only on the sign of the gradient. If the sign of a parameter's gradient is unchanged from the previous step, its step size is increased; if the sign reverses, its step size is decreased:

\Delta_{i,t} = \begin{cases} \min(\Delta_{i,t-1} \cdot \eta^+,\; \Delta_{\max}) & \text{if } g_{i,t} \cdot g_{i,t-1} > 0 \\ \max(\Delta_{i,t-1} \cdot \eta^-,\; \Delta_{\min}) & \text{if } g_{i,t} \cdot g_{i,t-1} < 0 \\ \Delta_{i,t-1} & \text{otherwise} \end{cases}

The parameter update is then:

\theta_{i,t} = \theta_{i,t-1} - \operatorname{sign}(g_{i,t}) \cdot \Delta_{i,t}
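The two formulas above can be sketched in plain Python. This is a minimal illustration that follows the equations as written, not the library's implementation; the names `rprop_step` and `sign` are hypothetical, and a real implementation may treat the sign-reversal case differently (for example, by skipping the update on that step).

```python
def sign(x: float) -> float:
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def rprop_step(theta, grad, prev_grad, step,
               etas=(0.5, 1.2), step_sizes=(1e-6, 50.0)):
    """Apply one Rprop update to lists of floats (hypothetical helper).

    theta, grad, prev_grad, and step are equal-length lists holding the
    parameters, current gradients, previous gradients, and per-parameter
    step sizes. Returns the updated (theta, step) as new lists.
    """
    eta_minus, eta_plus = etas
    step_min, step_max = step_sizes
    new_theta, new_step = [], []
    for th, g, pg, d in zip(theta, grad, prev_grad, step):
        prod = g * pg
        if prod > 0:
            # Same sign as last step: grow the step, capped at step_max.
            d = min(d * eta_plus, step_max)
        elif prod < 0:
            # Sign reversed: shrink the step, floored at step_min.
            d = max(d * eta_minus, step_min)
        # prod == 0: keep the previous step size unchanged.
        new_step.append(d)
        # Move against the gradient sign by the adapted step size.
        new_theta.append(th - sign(g) * d)
    return new_theta, new_step


# One update on three parameters covering all three cases.
theta, step = rprop_step(
    theta=[1.0, -2.0, 0.5],
    grad=[0.3, -4.0, 0.0],
    prev_grad=[0.1, 2.0, 1.0],
    step=[0.01, 0.01, 0.01],
)
```

Note that the gradient magnitudes (0.3 versus 4.0) play no role: only the signs and the stored step sizes determine how far each parameter moves.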

Parameters

params: iterable of Parameter or iterable of dict

Parameters to optimise, or a list of parameter-group dicts.

lr: float = 0.01

Initial step size \Delta_{i,0} for each parameter (default: 1e-2).

etas: tuple of float = (0.5, 1.2)

Multiplicative decrease and increase factors (\eta^-, \eta^+) (default: (0.5, 1.2)).

step_sizes: tuple of float = (1e-06, 50)

Minimum and maximum allowed step sizes (\Delta_{\min}, \Delta_{\max}) (default: (1e-6, 50)).

Attributes

param_groups: list of dict

Parameter groups with keys "params", "lr", "eta_minus", "eta_plus", "step_min", and "step_max".

defaults: dict

Default hyperparameter values.
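To make the relationship between the constructor arguments and the group keys concrete, here is a small sketch of how the two tuples could be unpacked into a group dict. The helper `make_group` is hypothetical and is not part of the library; the pairing of `etas` with "eta_minus"/"eta_plus" and of `step_sizes` with "step_min"/"step_max" is an assumption based on the parameter descriptions above.

```python
def make_group(params, lr=0.01, etas=(0.5, 1.2), step_sizes=(1e-6, 50)):
    """Build a parameter-group dict with the documented keys.

    Hypothetical helper for illustration only; the library may construct
    its groups differently.
    """
    eta_minus, eta_plus = etas          # (decrease factor, increase factor)
    step_min, step_max = step_sizes     # (lower bound, upper bound)
    return {
        "params": list(params),
        "lr": lr,
        "eta_minus": eta_minus,
        "eta_plus": eta_plus,
        "step_min": step_min,
        "step_max": step_max,
    }


# Placeholder strings stand in for real Parameter objects here.
group = make_group(["w", "b"], lr=0.05)
```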

Notes

Rprop is particularly effective for full-batch training, where gradients are deterministic and a sign change reliably indicates that the last update overshot along that coordinate. In stochastic mini-batch training, a sign change can come from sampling noise instead, so RMSprop or Adam is generally preferred.
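A toy calculation illustrates why the gradient sign is less reliable on mini-batches. The sketch below assumes a one-parameter loss 0.5 * (w - x)^2 averaged over four made-up data points; the data and variable names are illustrative only.

```python
# Per-sample loss 0.5 * (w - x)**2 has gradient (w - x) with respect to w.
data = [0.0, 0.0, 0.0, 2.0]
w = 1.0

per_sample_grads = [w - x for x in data]

# Full-batch gradient: the mean over all samples, identical on every call.
full_batch_grad = sum(per_sample_grads) / len(per_sample_grads)

# Signs seen when each sample is used as a mini-batch of size one.
per_sample_signs = [(g > 0) - (g < 0) for g in per_sample_grads]

# Count how many single-sample batches disagree with the full-batch sign.
full_sign = (full_batch_grad > 0) - (full_batch_grad < 0)
disagreements = sum(1 for s in per_sample_signs if s != full_sign)
```

Here one of the four single-sample gradients points the opposite way from the full-batch gradient. A sign-based rule would treat that as a reversal and shrink the step size even though the parameter has not moved past the minimum.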

Examples

>>> import lucid.optim as optim
>>> optimizer = optim.Rprop(model.parameters(), lr=1e-2, etas=(0.5, 1.2))
>>> optimizer.zero_grad()
>>> loss.backward()
>>> optimizer.step()

Methods (2)

__init__ → None

__init__(params: Iterable[Parameter] | Iterable[dict[str, object]], lr: float = 0.01, etas: tuple[float, float] = (0.5, 1.2), step_sizes: tuple[float, float] = (1e-06, 50))

Initialise the Rprop optimizer. See the class docstring for parameter semantics.

step → Tensor | None

step(closure: _OptimizerClosure = None)

Perform a single Rprop step.