megengine.optimizer

>>> import megengine.optimizer as optim

Optimizer

Base class for all optimizers.
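
All optimizers share the same workflow: attach parameters to a GradManager for autodiff, compute gradients, then apply them with step() and reset them with clear_grad(). A minimal sketch follows; the toy model, data, and squared-error loss are illustrative assumptions, not part of this reference.

>>> import numpy as np
>>> import megengine as mge
>>> import megengine.module as M
>>> from megengine.autodiff import GradManager
>>> model = M.Linear(4, 2)                        # toy model (assumption)
>>> gm = GradManager().attach(model.parameters())
>>> opt = optim.SGD(model.parameters(), lr=0.01)
>>> x = mge.tensor(np.random.randn(8, 4).astype("float32"))
>>> y = mge.tensor(np.random.randn(8, 2).astype("float32"))
>>> with gm:                                      # record ops and compute grads
...     loss = ((model(x) - y) ** 2).mean()
...     gm.backward(loss)
>>> opt.step()                                    # apply the update rule
>>> opt.clear_grad()                              # reset accumulated gradients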

Common optimizers

SGD

Implements stochastic gradient descent.
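
A sketch of constructing SGD, reusing the model from the example above; the momentum and weight_decay values are arbitrary.

>>> opt = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)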

AdamW

Implements the AdamW algorithm proposed in "Decoupled Weight Decay Regularization".
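
A sketch of constructing AdamW, whose weight decay is decoupled from the gradient-based update; the hyperparameter values are arbitrary.

>>> opt = optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)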

Adam

Implements the Adam algorithm proposed in "Adam: A Method for Stochastic Optimization".
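
A sketch of constructing Adam; the betas and eps shown are the conventional defaults, written out for illustration.

>>> opt = optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8)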

Adagrad

Implements the Adagrad algorithm proposed in "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization".
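
A sketch of constructing Adagrad; the learning rate is arbitrary.

>>> opt = optim.Adagrad(model.parameters(), lr=1e-2)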

Adadelta

Implements the Adadelta algorithm proposed in "ADADELTA: An Adaptive Learning Rate Method".
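
A sketch of constructing Adadelta; the lr and rho values are illustrative (rho controls the decay of the running average of squared gradients).

>>> opt = optim.Adadelta(model.parameters(), lr=1.0, rho=0.9)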

LAMB

Implements the LAMB algorithm.

LAMBFp16

Variant of LAMB for float16 parameters.
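
A sketch of constructing LAMB; only the learning rate is passed here, since other keyword arguments may vary by version.

>>> opt = optim.LAMB(model.parameters(), lr=1e-3)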

Learning rate scheduling

LRScheduler

Base class for all learning rate schedulers.

MultiStepLR

Decays the learning rate of each parameter group by gamma once the number of epochs reaches one of the milestones.
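
A sketch of pairing an optimizer with MultiStepLR; the milestones and gamma values are arbitrary. The learning rate is multiplied by gamma at epochs 30 and 80, and scheduler.step() is called once per epoch.

>>> opt = optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = optim.MultiStepLR(opt, milestones=[30, 80], gamma=0.1)
>>> for epoch in range(100):
...     # ... run one training epoch with opt ...
...     scheduler.step()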

Gradient processing

clip_grad_norm

Clips gradient norm of an iterable of parameters.
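
A sketch of global-norm clipping, reusing gm, model, x, y, and opt from the example above; max_norm=1.0 is arbitrary. Clipping is applied after gradients are computed and before the optimizer step.

>>> with gm:
...     loss = ((model(x) - y) ** 2).mean()
...     gm.backward(loss)
>>> optim.clip_grad_norm(model.parameters(), max_norm=1.0)  # rescale grads in place
>>> opt.step()
>>> opt.clear_grad()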

clip_grad_value

Clips the gradients of an iterable of parameters between specified lower and upper bounds.
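
A sketch of element-wise value clipping, applied at the same point in the loop as clip_grad_norm above; the bounds are arbitrary.

>>> optim.clip_grad_value(model.parameters(), lower=-0.5, upper=0.5)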