megengine.optimizer¶
>>> import megengine.optimizer as optim
Optimizer | Base class for all optimizers.
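Every concrete optimizer below shares the interface defined on this base class: it holds the attached parameters, applies an update in step(), resets gradients in clear_grad(), and can be checkpointed through state_dict() / load_state_dict(). A minimal sketch of that interface, assuming an SGD instance attached to the parameters of a small megengine.module.Linear model:

>>> import megengine.module as M
>>> import megengine.optimizer as optim
>>> net = M.Linear(4, 2)
>>> opt = optim.SGD(net.parameters(), lr=0.01)
>>> state = opt.state_dict()    # snapshot of hyperparameters and per-parameter state
>>> opt.load_state_dict(state)  # restore the snapshot, e.g. when resuming training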
Common optimizers¶
SGD | Implements stochastic gradient descent.
AdamW | Implements the AdamW algorithm proposed in "Decoupled Weight Decay Regularization".
Adam | Implements the Adam algorithm proposed in "Adam: A Method for Stochastic Optimization".
Adagrad | Implements the Adagrad algorithm proposed in "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization".
Adadelta | Implements the Adadelta algorithm proposed in "ADADELTA: An Adaptive Learning Rate Method".
LAMB | Implements the LAMB algorithm.
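A minimal training-step sketch with SGD; any optimizer in the table above can be swapped in, since they all expose the same step() / clear_grad() calls. It assumes a small M.Linear model, a random NumPy batch, and megengine.autodiff.GradManager to record gradients:

>>> import numpy as np
>>> import megengine as mge
>>> import megengine.module as M
>>> import megengine.optimizer as optim
>>> from megengine.autodiff import GradManager
>>> net = M.Linear(4, 2)
>>> opt = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
>>> gm = GradManager().attach(net.parameters())
>>> x = mge.tensor(np.random.randn(8, 4).astype("float32"))
>>> with gm:                          # record the forward pass
...     loss = (net(x) ** 2).mean()   # placeholder loss for illustration
...     gm.backward(loss)             # populate parameter gradients
>>> opt.step()                        # apply the SGD update
>>> opt.clear_grad()                  # reset gradients before the next iteration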
Learning rate adjustment¶
LRScheduler | Base class for all learning rate based schedulers.
MultiStepLR | Decays the learning rate of each parameter group by gamma once the number of epoch reaches one of the milestones.
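A sketch of epoch-level scheduling with MultiStepLR, reusing the opt instance from the training sketch above; the milestones and gamma values are illustrative only:

>>> scheduler = optim.MultiStepLR(opt, milestones=[30, 80], gamma=0.1)
>>> for epoch in range(100):
...     # ... run one epoch of training with opt here ...
...     scheduler.step()   # learning rate is multiplied by 0.1 after epoch 30 and again after epoch 80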
Gradient processing¶
clip_grad_norm | Clips the gradient norm of an iterable of parameters.
clip_grad_value | Clips the gradients of an iterable of parameters to a specified lower and upper bound.
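Both helpers operate on gradients that have already been computed, so they are called between gm.backward() and opt.step(). A sketch reusing the training setup above; in practice you would normally pick one of the two forms of clipping:

>>> with gm:
...     loss = (net(x) ** 2).mean()
...     gm.backward(loss)
>>> total_norm = optim.clip_grad_norm(net.parameters(), max_norm=1.0)  # rescale gradients if their global L2 norm exceeds 1.0
>>> optim.clip_grad_value(net.parameters(), lower=-0.5, upper=0.5)     # clamp each gradient element into [-0.5, 0.5]
>>> opt.step()
>>> opt.clear_grad()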