megengine.functional.nn.roi_pooling
- roi_pooling(inp, rois, output_shape, mode='max', scale=1.0)
Applies RoI (Region of Interest) pooling to the input feature map, as described in Faster R-CNN.
- Parameters
inp (Tensor) – the input feature map, with shape (n, c, h, w).
rois (Tensor) – the Regions of Interest, with shape (K, 5): K boxes in (idx, x1, y1, x2, y2) format from which the regions are taken. The coordinates (x1, y1) and (x2, y2) must satisfy 0 <= x1 < x2 and 0 <= y1 < y2. The first column, idx, holds the index of the corresponding element in the input batch, i.e. a number in [0, n - 1].
output_shape – the (height, width) of the pooled output for each RoI, matching the last two dimensions of the returned tensor.
mode (str) – "max" or "average", the pooling mode to use. Default: "max"
scale (float) – the scale factor that maps RoI coordinates onto the input feature map. For example, if the RoIs are given in the coordinates of a 224 × 224 image and the input is a 112 × 112 feature map, scale should be set to 0.5. Default: 1.0
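The constraints on rois and the role of scale can be sketched in plain Python. This is only an illustration of the documented input format: `check_rois` and `to_feature_coords` are hypothetical helper names, not part of MegEngine, and the rows are ordinary tuples and not Tensors.

```python
def check_rois(rois, batch_size):
    """Raise ValueError if a row breaks the documented (idx, x1, y1, x2, y2) rules."""
    for row in rois:
        if len(row) != 5:
            raise ValueError("each RoI needs 5 values: (idx, x1, y1, x2, y2)")
        idx, x1, y1, x2, y2 = row
        # idx must be a whole number in [0, n - 1], where n is the batch size.
        if idx != int(idx) or not (0 <= idx <= batch_size - 1):
            raise ValueError(f"batch index {idx} is outside [0, {batch_size - 1}]")
        # The corners must satisfy 0 <= x1 < x2 and 0 <= y1 < y2.
        if not (0 <= x1 < x2 and 0 <= y1 < y2):
            raise ValueError(f"invalid box corners in {row}")


def to_feature_coords(row, scale):
    """Map a box given in image coordinates onto the feature map."""
    idx, x1, y1, x2, y2 = row
    return (idx, x1 * scale, y1 * scale, x2 * scale, y2 * scale)


# A box on a 224 x 224 image, pooled from a 112 x 112 feature map (scale = 0.5).
roi = (0, 10, 20, 110, 120)
check_rois([roi], batch_size=1)
scaled = to_feature_coords(roi, scale=0.5)
```

Here `scaled` holds the same box expressed in feature-map coordinates, which is the region the pooling would read from under this interpretation of scale.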
- Return type
Tensor
- Returns
output tensor of shape (K, C, output_shape[0], output_shape[1]), holding the pooled feature of each RoI.
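To make the output shape concrete, here is a naive reference sketch of RoI max pooling on nested Python lists. It assumes a Caffe-style convention (corners rounded to the nearest cell, inclusive end coordinate, empty bins filled with 0); MegEngine's kernel may use different rounding, so treat `roi_max_pool` as a hypothetical illustration and not the library's implementation.

```python
import math


def roi_max_pool(feature, rois, output_shape, scale=1.0):
    """Naive RoI max pooling.

    feature: nested lists of shape (n, c, h, w)
    rois: list of (idx, x1, y1, x2, y2) rows
    Returns nested lists of shape (K, c, output_shape[0], output_shape[1]).
    """
    out_h, out_w = output_shape
    result = []
    for idx, x1, y1, x2, y2 in rois:
        image = feature[int(idx)]
        h, w = len(image[0]), len(image[0][0])
        # Map the RoI corners onto the feature map (assumed rounding rule).
        x_start = math.floor(x1 * scale + 0.5)
        y_start = math.floor(y1 * scale + 0.5)
        x_end = math.floor(x2 * scale + 0.5)
        y_end = math.floor(y2 * scale + 0.5)
        roi_h = max(y_end - y_start + 1, 1)
        roi_w = max(x_end - x_start + 1, 1)
        bin_h, bin_w = roi_h / out_h, roi_w / out_w
        pooled_channels = []
        for channel in image:
            pooled = []
            for ph in range(out_h):
                # Row range of this bin, clipped to the feature map.
                r0 = min(max(y_start + math.floor(ph * bin_h), 0), h)
                r1 = min(max(y_start + math.ceil((ph + 1) * bin_h), 0), h)
                row = []
                for pw in range(out_w):
                    c0 = min(max(x_start + math.floor(pw * bin_w), 0), w)
                    c1 = min(max(x_start + math.ceil((pw + 1) * bin_w), 0), w)
                    values = [channel[r][c]
                              for r in range(r0, r1) for c in range(c0, c1)]
                    row.append(max(values) if values else 0.0)
                pooled.append(row)
            pooled_channels.append(pooled)
        result.append(pooled_channels)
    return result


# One 4 x 4 single-channel feature map with values 0.0 .. 15.0.
feature = [[[[float(r * 4 + c) for c in range(4)] for r in range(4)]]]
pooled = roi_max_pool(feature, [(0, 0, 0, 3, 3)], (2, 2))
```

With one RoI covering the whole map and output_shape (2, 2), each output cell is the maximum of one 2 x 2 quadrant, and the result has shape (1, 1, 2, 2), i.e. (K, C, output_shape[0], output_shape[1]).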
Examples
>>> import numpy as np
>>> from megengine import Tensor
>>> import megengine.functional as F
>>> np.random.seed(42)
>>> inp = Tensor(np.random.randn(1, 1, 128, 128))
>>> rois = Tensor(np.random.random((4, 5)))
>>> y = F.vision.roi_pooling(inp, rois, (2, 2))
>>> y.numpy()[0].round(decimals=4)
array([[[-0.1383, -0.1383],
        [-0.5035, -0.5035]]], dtype=float32)