megengine.optimizer

>>> import megengine.optimizer as optim

Optimizer

Base class for all optimizers.
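
All optimizers share the same workflow: attach parameters to a GradManager for autodiff, compute gradients, then apply them with step() and reset them with clear_grad(). A minimal sketch follows; the toy model, data, and squared-error loss are illustrative assumptions, not part of this reference.

>>> import numpy as np
>>> import megengine as mge
>>> import megengine.module as M
>>> from megengine.autodiff import GradManager
>>> model = M.Linear(4, 2)                        # toy model (assumption)
>>> gm = GradManager().attach(model.parameters())
>>> opt = optim.SGD(model.parameters(), lr=0.01)
>>> x = mge.tensor(np.random.randn(8, 4).astype("float32"))
>>> y = mge.tensor(np.random.randn(8, 2).astype("float32"))
>>> with gm:                                      # record ops and compute grads
...     loss = ((model(x) - y) ** 2).mean()
...     gm.backward(loss)
>>> opt.step()                                    # apply the update rule
>>> opt.clear_grad()                              # reset accumulated gradients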

Common optimizers

SGD

Implements stochastic gradient descent.
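
A sketch of constructing SGD, reusing the model from the example above; the momentum and weight_decay values are arbitrary.

>>> opt = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)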

AdamW

Implements the AdamW algorithm proposed in "Decoupled Weight Decay Regularization".
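
A sketch of constructing AdamW, whose weight decay is decoupled from the gradient-based update; the hyperparameter values are arbitrary.

>>> opt = optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)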

Adam

Implements the Adam algorithm proposed in "Adam: A Method for Stochastic Optimization".
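
A sketch of constructing Adam; the betas and eps shown are the conventional defaults, written out for illustration.

>>> opt = optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8)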

Adagrad

Implements the Adagrad algorithm proposed in "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization".
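
A sketch of constructing Adagrad; the learning rate is arbitrary.

>>> opt = optim.Adagrad(model.parameters(), lr=1e-2)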

Adadelta

Implements the Adadelta algorithm proposed in "ADADELTA: An Adaptive Learning Rate Method".
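
A sketch of constructing Adadelta; the lr and rho values are illustrative (rho controls the decay of the running average of squared gradients).

>>> opt = optim.Adadelta(model.parameters(), lr=1.0, rho=0.9)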

LAMB

Implements the LAMB algorithm.

LAMBFp16

Variant of LAMB for float16 parameters.
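
A sketch of constructing LAMB; only the learning rate is passed here, since other keyword arguments may vary by version.

>>> opt = optim.LAMB(model.parameters(), lr=1e-3)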

Learning rate scheduling

LRScheduler

Base class for all learning rate schedulers.

MultiStepLR

Decays the learning rate of each parameter group by gamma once the number of epochs reaches one of the milestones.
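
A sketch of pairing an optimizer with MultiStepLR; the milestones and gamma values are arbitrary. The learning rate is multiplied by gamma at epochs 30 and 80, and scheduler.step() is called once per epoch.

>>> opt = optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = optim.MultiStepLR(opt, milestones=[30, 80], gamma=0.1)
>>> for epoch in range(100):
...     # ... run one training epoch with opt ...
...     scheduler.step()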

Gradient processing

clip_grad_norm

Clips gradient norm of an iterable of parameters.
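
A sketch of global-norm clipping, reusing gm, model, x, y, and opt from the example above; max_norm=1.0 is arbitrary. Clipping is applied after gradients are computed and before the optimizer step.

>>> with gm:
...     loss = ((model(x) - y) ** 2).mean()
...     gm.backward(loss)
>>> optim.clip_grad_norm(model.parameters(), max_norm=1.0)  # rescale grads in place
>>> opt.step()
>>> opt.clear_grad()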

clip_grad_value

Clips the gradients of an iterable of parameters between specified lower and upper bounds.
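
A sketch of element-wise value clipping, applied at the same point in the loop as clip_grad_norm above; the bounds are arbitrary.

>>> optim.clip_grad_value(model.parameters(), lower=-0.5, upper=0.5)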