megengine.functional.nn.cross_entropy
- cross_entropy(pred, label, axis=1, with_logits=True, label_smooth=0, reduction='mean')[source]
Computes the multi-class cross entropy loss (using logits by default).
When using label smoothing, the label distribution is as follows:
\[y^{LS}_{k} = y_{k}\left(1-\alpha\right) + \alpha/K\]

where \(y^{LS}\) and \(y\) are the new and the original label distributions respectively, \(k\) is the index of the label distribution, \(\alpha\) is label_smooth and \(K\) is the number of classes. A label smoothing sketch is shown at the end of the Examples below.

- Parameters
  - pred (Tensor) – input tensor representing the predicted value.
  - label (Tensor) – input tensor representing the classification label.
  - axis (int) – an axis along which softmax will be applied. Default: 1
  - with_logits (bool) – whether to apply softmax first. Default: True
  - label_smooth (float) – a label smoothing parameter that re-distributes the target distribution. Default: 0
  - reduction (str) – the reduction to apply to the output: 'none' | 'mean' | 'sum'. Default: 'mean'
- Return type
  Tensor
- Returns
  loss value.
Examples
By default (with_logits is True), pred is assumed to be logits and class probabilities are given by softmax. This has better numerical stability than sequential calls to softmax and cross_entropy.

>>> pred = Tensor([[0., 1.], [0.3, 0.7], [0.7, 0.3]])
>>> label = Tensor([1., 1., 1.])
>>> F.nn.cross_entropy(pred, label)
Tensor(0.57976407, device=xpux:0)
>>> F.nn.cross_entropy(pred, label, reduction="none")
Tensor([0.3133 0.513 0.913 ], device=xpux:0)
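The two formulations agree up to floating-point error; as a sketch (assuming F.softmax is applied along the class axis, axis=1 here), normalizing pred by hand and disabling with_logits should reproduce the values above, only with less numerical headroom:

>>> probs = F.softmax(pred, axis=1)  # the same normalization cross_entropy applies internally
>>> F.nn.cross_entropy(probs, label, with_logits=False)  # expected to match ≈ 0.5798 above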
If the pred values are already probabilities, set with_logits to False:

>>> pred = Tensor([[0., 1.], [0.3, 0.7], [0.7, 0.3]])
>>> label = Tensor([1., 1., 1.])
>>> F.nn.cross_entropy(pred, label, with_logits=False)
Tensor(0.5202159, device=xpux:0)
>>> F.nn.cross_entropy(pred, label, with_logits=False, reduction="none")
Tensor([0. 0.3567 1.204 ], device=xpux:0)
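As an illustration of label_smooth (a sketch; the expected value is derived by hand from the smoothing formula above rather than copied from library output): with \(\alpha = 0.1\) and \(K = 2\), the hard label 1 becomes the soft target \([0.05, 0.95]\), so each sample's loss mixes the log-probabilities of both classes:

>>> pred = Tensor([[0., 1.], [0.3, 0.7], [0.7, 0.3]])
>>> label = Tensor([1., 1., 1.])
>>> F.nn.cross_entropy(pred, label, label_smooth=0.1)  # per the formula, mean loss ≈ 0.5964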