BatchNorm2d
- class BatchNorm2d(num_features, eps=1e-05, momentum=0.9, affine=True, track_running_stats=True, freeze=False, **kwargs)[source]
Applies Batch Normalization over a 4D tensor.
\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]
The mean and standard deviation are calculated per channel over the mini-batches, and \(\gamma\) and \(\beta\) are learnable parameter vectors.
By default, during training this layer keeps running estimates of its computed mean and variance, which are then used for normalization during evaluation. The running estimates are kept with a default momentum of 0.9. If track_running_stats is set to False, this layer does not keep running estimates, and batch statistics are used during evaluation instead.
Because Batch Normalization is done over the C dimension, computing statistics on the (N, H, W) slices, it is common terminology to call this Spatial Batch Normalization.
Note
The update formula for running_mean and running_var (taking running_mean as an example) is
\[\textrm{running\_mean} = \textrm{momentum} \times \textrm{running\_mean} + (1 - \textrm{momentum}) \times \textrm{batch\_mean}\]
which may be defined differently in other frameworks. Most notably, a momentum of 0.1 in PyTorch is equivalent to a momentum of 0.9 here.
- Parameters
num_features – usually \(C\) from an input of shape \((N, C, H, W)\) or the highest ranked dimension of an input less than 4D.
eps – a value added to the denominator for numerical stability. Default: 1e-5
momentum – the value used for the running_mean and running_var computation. Default: 0.9
affine – a boolean value that when set to True, this module has learnable affine parameters. Default: True
track_running_stats – when set to True, this module tracks the running mean and variance. When set to False, this module does not track such statistics and always uses batch statistics in both training and eval modes. Default: True
freeze – when set to True, this module does not update the running mean and variance, and uses the running mean and variance instead of the batch mean and batch variance to normalize the input. This parameter takes effect only when the module is initialized with track_running_stats set to True. Default: False
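To make the per-channel statistics, the momentum update, and the freeze behavior concrete, here is a minimal NumPy sketch of how such a layer could compute its forward pass in training mode. This is an illustrative reference, not MegEngine's actual implementation, and `batch_norm_2d` is a hypothetical helper name:

```python
import numpy as np

# Hypothetical NumPy sketch of a BatchNorm2d forward pass in training mode
# with track_running_stats=True. Not MegEngine's actual implementation.
def batch_norm_2d(x, running_mean, running_var, gamma, beta,
                  eps=1e-5, momentum=0.9, freeze=False):
    if freeze:
        # freeze=True: normalize with the stored running statistics
        # and leave them unchanged.
        mean, var = running_mean, running_var
    else:
        # Statistics are computed per channel over the (N, H, W) slices.
        mean = x.mean(axis=(0, 2, 3))
        var = x.var(axis=(0, 2, 3))
        # Update rule from the note above:
        # running = momentum * running + (1 - momentum) * batch.
        running_mean[:] = momentum * running_mean + (1 - momentum) * mean
        running_var[:] = momentum * running_var + (1 - momentum) * var
    # Broadcast the channel statistics over the (N, C, H, W) input.
    x_hat = (x - mean[None, :, None, None]) / np.sqrt(
        var[None, :, None, None] + eps)
    return x_hat * gamma[None, :, None, None] + beta[None, :, None, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 3, 3)).astype("float32")
running_mean = np.zeros(4, dtype="float32")
running_var = np.ones(4, dtype="float32")
y = batch_norm_2d(x, running_mean, running_var,
                  gamma=np.ones(4, "float32"), beta=np.zeros(4, "float32"))
# After one step with momentum=0.9, running_mean has moved 10% of the
# way from 0 toward the batch mean.
```

With `freeze=True`, the same call would leave `running_mean` and `running_var` untouched and use them in place of the batch statistics.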
- Shape:
Input: \((N, C, H, W)\)
Output: \((N, C, H, W)\) (same shape as input)
Examples
>>> import numpy as np
>>> import megengine as mge
>>> import megengine.module as M
>>> # With Learnable Parameters
>>> m = M.BatchNorm2d(4)
>>> inp = mge.tensor(np.random.rand(1, 4, 3, 3).astype("float32"))
>>> oup = m(inp)
>>> print(m.weight.numpy().flatten(), m.bias.numpy().flatten())
[1. 1. 1. 1.] [0. 0. 0. 0.]
>>> # Without Learnable Parameters
>>> m = M.BatchNorm2d(4, affine=False)
>>> oup = m(inp)
>>> print(m.weight, m.bias)
None None
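When porting hyperparameters from PyTorch, the momentum conversion described in the note above is just a complement. A small sketch, with a hypothetical helper name:

```python
# The update rule here:   running = m * running + (1 - m) * batch
# PyTorch's update rule:  running = (1 - m_t) * running + m_t * batch
# The two coincide when m = 1 - m_t.
def to_local_momentum(pytorch_momentum):
    # Hypothetical conversion helper; follows from the two formulas above.
    return 1.0 - pytorch_momentum

converted = to_local_momentum(0.1)  # PyTorch's default momentum
```

`to_local_momentum(0.1)` gives 0.9, which is this layer's default.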