Adam

class Adam(params, lr, betas=(0.9, 0.999), eps=1e-08, weight_decay=0.0)

Implements the Adam algorithm proposed in “Adam: A Method for Stochastic Optimization”.

Parameters
  • params (Union[Iterable[Parameter], dict]) – iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float) – learning rate.

  • betas (Tuple[float, float]) – coefficients used for computing running averages of the gradient and its square. Default: (0.9, 0.999).

  • eps (float) – term added to the denominator to improve numerical stability. Default: 1e-8.

  • weight_decay (float) – weight decay (L2 penalty). Default: 0.0.

Returns

An instance of the Adam optimizer.
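To make the roles of lr, betas, eps, and weight_decay concrete, the following is a minimal NumPy sketch of a single Adam update step (not this library's implementation); it applies weight decay as a plain L2 penalty folded into the gradient, matching the parameter description above.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, betas=(0.9, 0.999),
              eps=1e-8, weight_decay=0.0):
    """One Adam update. m and v are the running first- and second-moment
    estimates; t is the 1-based step count used for bias correction."""
    if weight_decay != 0.0:
        grad = grad + weight_decay * param        # L2 penalty added to the gradient
    m = betas[0] * m + (1 - betas[0]) * grad      # running average of the gradient
    v = betas[1] * v + (1 - betas[1]) * grad**2   # running average of its square
    m_hat = m / (1 - betas[0] ** t)               # bias-corrected moments
    v_hat = v / (1 - betas[1] ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)  # eps guards the division
    return param, m, v

# Sanity check: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = np.array([0.0])
m, v = np.zeros_like(x), np.zeros_like(x)
for t in range(1, 5001):
    g = 2 * (x - 3.0)
    x, m, v = adam_step(x, g, m, v, t, lr=0.05)
print(x)  # converges toward 3.0
```

Note that eps sits outside the square root in this sketch; libraries differ on this detail, so consult the implementation source when exact reproducibility matters.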